Smart pointers — Box, Rc, Arc
You’ve now seen Arc three times — sharing a schema, a document context, a filename. This
lesson makes the smart pointers explicit: Box (put one thing on the heap), Rc and
Arc (let many owners share one thing). The engine’s choices among them are deliberate,
and one of them — Arc over Rc — is forced by how Clinker runs.
You’ll be able to: explain what Box, Rc, and Arc each give you, and why the
engine shares schemas and document context with Arc.
Box: one owner, on the heap
A Box<T> puts a T on the heap and owns it. You already met its main job in lesson 2.1:
Map(Box<...>) boxes the map so the Value enum stays small — the variant holds a
pointer, not the whole map. Box is also what makes recursive types possible (a Value
can contain Values) — without the indirection, the type would be infinitely sized.
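A minimal sketch of that indirection, using a hypothetical `Node` type rather than Clinker's real `Value`: the recursive variant holds `Box<Node>`, so the enum has a fixed size, and a `Box` to a large payload is itself only pointer-sized.

```rust
// Hypothetical miniature of a recursive type; this is not Clinker's `Value`.
enum Node {
    Leaf(i64),
    // Without `Box`, `Node` would contain itself by value and have no finite size.
    Pair(Box<Node>, Box<Node>),
}

// Walks the tree through the boxes and adds up every leaf.
fn sum(node: &Node) -> i64 {
    match node {
        Node::Leaf(n) => *n,
        Node::Pair(left, right) => sum(left) + sum(right),
    }
}

// A `Box` to a sized type is one pointer wide, however big the payload is.
fn box_is_pointer_sized() -> bool {
    std::mem::size_of::<Box<[u8; 256]>>() == std::mem::size_of::<usize>()
}

fn main() {
    let tree = Node::Pair(
        Box::new(Node::Leaf(2)),
        Box::new(Node::Pair(Box::new(Node::Leaf(3)), Box::new(Node::Leaf(4)))),
    );
    println!("sum = {}", sum(&tree));
    println!("box is pointer-sized: {}", box_is_pointer_sized());
}
```

Removing the `Box` from `Pair` makes the compiler reject the type as recursive without indirection, which is the "infinitely sized" problem in practice.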
Rc vs Arc: many owners, one allocation
Sometimes many things need to share the same data — like every record of a file sharing
one Schema. That’s what Rc and Arc do: they store a value on the heap next to a reference
count, hand out cheap clones (each clone just bumps the count), and free the value when the
last owner drops. Cloning an Arc<Schema> doesn’t copy the schema; it copies a pointer and
increments the count.
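The counting can be observed directly with `Rc::strong_count`. This is a small single-threaded sketch; the `Schema` struct and its `columns` field are stand-ins invented for illustration, not the engine's type.

```rust
use std::rc::Rc;

// Hypothetical stand-in for a schema; the real type lives in the engine.
struct Schema {
    columns: Vec<String>,
}

// Returns the strong count at three moments: after creation,
// after one clone, and after that clone is dropped.
fn observe_counts() -> (usize, usize, usize) {
    let first = Rc::new(Schema {
        columns: vec!["id".to_string(), "name".to_string()],
    });
    let at_creation = Rc::strong_count(&first);

    // Copies the pointer and increments the count; the Vec is not duplicated.
    let second = Rc::clone(&first);
    let after_clone = Rc::strong_count(&first);
    assert_eq!(second.columns.len(), 2); // both handles read the same data

    // Dropping a handle decrements the count; the value lives on.
    drop(second);
    let after_drop = Rc::strong_count(&first);

    (at_creation, after_clone, after_drop)
}

fn main() {
    let (a, b, c) = observe_counts();
    println!("created: {a}, cloned: {b}, after drop: {c}");
}
```

`Arc` exposes the same `clone` and `strong_count` API, so the sketch reads identically with `std::sync::Arc` in place of `Rc`.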
The difference between them is thread safety. Rc’s counter is a plain integer — fast,
but unsafe to touch from two threads. Arc’s counter is atomic — safe across threads,
at a tiny cost. Clinker reads each source on its own thread and runs heavy operators on a
thread pool, so records and the things they share must be able to cross threads. That’s why
the engine uses Arc everywhere it shares: Rc is neither Send nor Sync, so the compiler
rejects any attempt to move one to another thread.
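A sketch of what crossing threads looks like, using only the standard library. The function name, the worker count, and the filename are made up for the example and do not reflect how Clinker schedules its threads.

```rust
use std::sync::Arc;
use std::thread;

// Shares one `Arc<str>` with `workers` threads and sums the length
// each thread reads. Hypothetical helper, for illustration only.
fn total_len_across_threads(name: &str, workers: usize) -> usize {
    let shared: Arc<str> = Arc::from(name);

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Each thread gets its own handle to the same allocation.
            let local = Arc::clone(&shared);
            thread::spawn(move || local.len())
        })
        .collect();

    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    let total = total_len_across_threads("orders.csv", 4);
    println!("total bytes read: {total}");
}
```

Replacing `Arc` with `std::rc::Rc` here fails to compile: `thread::spawn` requires its closure to be `Send`, and `Rc<str>` is not.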
You can see it in the per-document context that records share:
clinker-record · document_context.rs · DocumentContext type @47d2e12

    pub struct DocumentContext {
        id: DocumentId,
        grain: DocumentGrain,
        source_file: Arc<str>, // shared per source file — every record points here
        // ...
    }

and in the string storage behind Value::String, which uses an Arc to share long
strings across clones (its full design — three storage strategies in one type — is a
Phase 4 deep dive):
clinker-record · field_str.rs · FieldStr type @47d2e12

    /// A field-value string stored inline, `Arc`-shared, or `Box`-unique
    /// behind a single `str` API. 24 bytes wide.
    pub struct FieldStr {
        repr: Repr, // inline bytes, an Arc<str> (shared), or a Box<str> (unique)
    }

Share without copying
Three handles, one Schema allocation. That’s how a million records carry “their”
schema for free.
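The three-handle situation described above can be sketched as follows. The `Schema` struct and its `fields` are assumptions made for the example, not the engine's definition.

```rust
use std::sync::Arc;

// Hypothetical stand-in for the engine's schema type.
struct Schema {
    fields: Vec<String>,
}

// Builds one schema, hands out two more handles, and reports:
// (number of handles, whether two handles point at the same allocation,
//  field count as read through a cloned handle).
fn share_schema() -> (usize, bool, usize) {
    let schema = Arc::new(Schema {
        fields: vec!["id".to_string(), "name".to_string()],
    });

    // Each clone copies a pointer and increments the atomic count.
    let for_record_a = Arc::clone(&schema);
    let for_record_b = Arc::clone(&schema);

    (
        Arc::strong_count(&schema),
        Arc::ptr_eq(&for_record_a, &for_record_b),
        for_record_b.fields.len(),
    )
}

fn main() {
    let (handles, same_allocation, field_count) = share_schema();
    println!("handles: {handles}");
    println!("same allocation: {same_allocation}");
    println!("fields seen through a clone: {field_count}");
}
```

`Arc::ptr_eq` compares the addresses the handles point to, so a `true` result means no second `Schema` was ever allocated.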
// quick check
Why does Clinker share data with Arc rather than Rc?
Clinker runs sources on separate threads and operators on a pool, so shared values must be Send/Sync. Arc's atomic counter is safe across threads; Rc's is not.
Checkpoint
You can share data cheaply. The final piece of the data layer ties ownership, borrowing, and sharing together: lifetimes, and the zero-copy reads they make safe.