Through the DAG
--explain told you customer_etl is a graph of four nodes — DAG nodes: 4. The plan
describes the graph; the executor walks it. This lesson is how a record actually
travels from one node to the next. Shallow now; Phase 4 is the deep version.
You’ll be able to: describe the path a record takes through the DAG, and explain how the engine picks the right handler for each node.
The graph is a chain of nodes
Section titled “The graph is a chain of nodes”For customer_etl, the DAG is a straight line — each node feeds the next:
source transform transform outputcustomers ──▶ active_only ──▶ final_flag ──▶ results (read CSV) (add is_active) (add tier) (write CSV)Every node is one variant of a single closed enum, PlanNode — source, transform,
output, and a handful of others (aggregate, route, combine…):
clinker-plan ·mod.rs ·PlanNode type @47d2e12
pub enum PlanNode { Source { /* ... */ }, Transform { /* ... */ }, Route { /* ... */ }, Merge { /* ... */ }, Sort { /* ... */ }, Aggregation { /* ... */ }, Output { /* ... */ }, // ... one variant per kind of node a pipeline can contain}One match, every node kind
Section titled “One match, every node kind”The executor walks the nodes and, for each, has to run the right logic — read for a
source, evaluate CXL for a transform, write for an output. It does that with a single
big match over the PlanNode enum:
clinker-exec ·dispatch.rs ·dispatch_plan_node fn @47d2e12
// the shape of it (real arms call into per-operator modules)match node { PlanNode::Source { .. } => /* read records */, PlanNode::Transform { .. } => /* run the CXL */, PlanNode::Output { .. } => /* write records */, // ... one arm per node kind}Because PlanNode is a closed set known at compile time, this match must handle
every kind — the compiler refuses to build if a node type is left unhandled. There’s no
plugin registry and no runtime lookup of “which handler?”; the set of operations is
fixed and exhaustive. Why Clinker chose a closed enum over open, pluggable operators
— and what that trades away — is one of the central decisions you’ll examine in Phase 4.
How a record moves
Section titled “How a record moves”Records don’t teleport to the end; each node hands its outputs to the next. The source
produces records and passes them downstream; active_only receives each one, runs its
CXL, and passes the (now slightly larger) record on; final_flag does the same; the
output writes whatever reaches it. One record at a time, in a streaming flow — which is
exactly why --explain reported Mode: Streaming.
// quick check
How does the executor choose what to do for each node in the DAG?
Dispatch is a match over the closed PlanNode enum (dispatch_plan_node). The node kinds are fixed and known at compile time, so the compiler enforces that every kind is handled. The trade-offs of closed-vs-open dispatch are a Phase 4 topic.
Checkpoint
Section titled “Checkpoint”You’ve seen the parts: records, the plan, the graph, the dispatch. Time to put one record through all of them at once.