Enum vs trait dispatch
Welcome to Phase 4 — Execution & Memory, the deepest pass. Phase 3 ended at the
plan/runtime boundary; now we cross it and watch records actually move. This first lesson
revisits a question from Phase 3 — “how do you call into one of many kinds?” — and finds the
engine answering it the opposite way from the IO seam. Back in lesson 3.1 the format
layer used Box<dyn FormatReader>: dynamic dispatch, because formats are an open-ended,
runtime-chosen plug-in seam. The DAG executor faces the same shape of problem — many kinds of
node — and deliberately chooses a closed enum and one exhaustive match instead.
You’ll be able to: explain when closed-enum dispatch beats trait-object dispatch, read the
executor’s central match, and say why the engine has no dyn Operator anywhere.
The closed set of node kinds
A compiled plan is a DAG of nodes, and a node is one of a fixed, engine-known set of kinds:
a source, a transform, a route, a merge, a sort, an aggregation, an output, and a handful more.
That set is a closed enum — the same PlanNode you glimpsed in Phase 3:
clinker-plan · mod.rs · PlanNode type @47d2e12

```rust
pub enum PlanNode {
    Source { /* ... */ },
    Transform { /* ... */ },
    Route { /* ... */ },
    Merge { /* ... */ },
    Sort { /* ... */ },
    Aggregation { /* ... */ },
    Output { /* ... */ },
    // ... 13 variants in all: the complete vocabulary of pipeline nodes
}
```

The key word is closed. The kinds of node a pipeline can contain are decided by the engine, not by users, not at run time. Contrast that with formats: anyone can add a new wire format, and which one a job uses is read from the plan at run time. Formats are open; node kinds are closed. That single difference drives the whole dispatch decision.
One exhaustive match, no dyn
Because the set is closed, the executor dispatches with a single match over the enum. Each arm
hands the node to its operator module:
clinker-exec · dispatch.rs · dispatch_plan_node fn @47d2e12

```rust
pub(crate) fn dispatch_plan_node(
    ctx: &mut ExecutorContext<'_>,
    current_dag: &ExecutionPlanDag,
    node_idx: NodeIndex,
) -> Result<(), PipelineError> {
    let node = current_dag.graph[node_idx].clone();
    match node {
        PlanNode::Source { .. } => dispatch_source(ctx, current_dag, node_idx, &node)?,
        PlanNode::Transform { .. } => dispatch_transform(ctx, current_dag, node_idx, &node)?,
        PlanNode::Route { .. } => dispatch_route(ctx, current_dag, node_idx, &node)?,
        PlanNode::Merge { .. } => dispatch_merge(ctx, current_dag, node_idx, &node)?,
        // ... one arm per variant, and crucially NO `_ =>` catch-all
    }
    Ok(())
}
```

There is no trait object here and none anywhere in the engine: searching the whole codebase
for dyn Operator or trait Operator finds nothing. Operators are not boxed behind a vtable;
they’re arms of a match. And there is no _ => wildcard — the match spells out every
variant. That second detail is doing real architectural work, as the next section shows.
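To make the role of the missing wildcard concrete, here is a minimal sketch. It uses a hypothetical three-variant Kind enum and two made-up functions, not the engine's PlanNode or its dispatcher. The first function has a catch-all arm and silently absorbs the variant it never names; the second lists every variant, so a newly added variant would stop the build until it gets its own arm.

```rust
// Hypothetical stand-in for the engine's PlanNode; names are illustrative only.
#[derive(Debug, Clone, Copy)]
enum Kind {
    Source,
    Transform,
    Sort,
}

// With a catch-all, an unnamed variant (here Sort) falls through silently.
fn with_wildcard(kind: Kind) -> &'static str {
    match kind {
        Kind::Source => "source",
        Kind::Transform => "transform",
        _ => "unhandled",
    }
}

// Without a catch-all, every variant must be listed. Adding a fourth
// variant to Kind would make this match fail to compile (error E0004).
fn exhaustive(kind: Kind) -> &'static str {
    match kind {
        Kind::Source => "source",
        Kind::Transform => "transform",
        Kind::Sort => "sort",
    }
}

fn main() {
    for kind in [Kind::Source, Kind::Transform, Kind::Sort] {
        println!(
            "{:?}: wildcard={}, exhaustive={}",
            kind,
            with_wildcard(kind),
            exhaustive(kind)
        );
    }
}
```

Both functions compile today; the difference only shows up on the day someone adds a variant, which is when the wildcard version keeps compiling and the exhaustive one does not.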
Why closed-enum dispatch wins here
Lesson 3.1 argued dynamic dispatch was right for the IO seam. Here the trade flips, for three concrete reasons:
- Exhaustiveness is a feature. Because the match has no catch-all, adding a new PlanNode variant turns every match that does not yet handle it into a compile error: the compiler hands you the exact list of sites to update (exactly the guarantee from lesson 2.2). With dyn Operator, a forgotten case would be a run-time surprise, not a build failure.
- The set is closed and known at compile time. A dyn seam exists to let outside code plug in types the engine never named. Node kinds are all named by the engine itself, so the open-endedness dyn buys is worthless here, and you’d pay a vtable indirection per node for it.
- Operators want specialized data. Each arm can pull the variant’s own payload (a sort’s keys, a transform’s compiled program) by pattern-matching, with no downcasting.
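The third point can be sketched with hypothetical types: a two-variant Node enum, a SortOp struct, and an Operator trait that exists only for this illustration (the engine has no such trait). The enum arm receives the payload directly through the pattern, while the trait-object route has to go through std::any::Any and a downcast that can fail at run time.

```rust
use std::any::Any;

// Hypothetical node enum; the engine's real PlanNode payloads differ.
enum Node {
    Sort { keys: Vec<String> },
    Source { name: String },
}

// Enum route: the payload arrives through the pattern, fully typed.
fn describe(node: &Node) -> String {
    match node {
        Node::Sort { keys } => format!("sort by {}", keys.join(", ")),
        Node::Source { name } => format!("read {}", name),
    }
}

// Trait-object route (illustrative only): to reach a concrete field,
// the caller must downcast and handle the failure case.
trait Operator {
    fn as_any(&self) -> &dyn Any;
}

struct SortOp {
    keys: Vec<String>,
}

impl Operator for SortOp {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// Returns None when the operator is not actually a SortOp.
fn sort_keys(op: &dyn Operator) -> Option<&[String]> {
    op.as_any().downcast_ref::<SortOp>().map(|s| s.keys.as_slice())
}

fn main() {
    let node = Node::Sort { keys: vec!["id".to_string(), "ts".to_string()] };
    println!("{}", describe(&node));
    println!("{}", describe(&Node::Source { name: "orders".to_string() }));

    let boxed: Box<dyn Operator> = Box::new(SortOp { keys: vec!["id".to_string()] });
    println!("{:?}", sort_keys(&*boxed));
}
```

The Option in sort_keys is the cost being described: the type system can no longer promise which operator it is looking at, so the caller has to check.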
The mental model: dyn is for open sets chosen by others at run time; a closed enum is for
sets the engine owns and wants the compiler to police. The IO seam is the former; the operator
set is the latter. Same language feature family (traits and enums), opposite tool for opposite
jobs.
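For contrast, here is a rough sketch of the open half of that mental model. The lesson names Box<dyn FormatReader> but not its methods, so the read method, the Registry type, and both reader structs below are assumptions made up for illustration. The point is only that the registry can hold a type it never names, chosen by a string at run time.

```rust
use std::collections::HashMap;

// Hypothetical trait: the lesson names FormatReader but not its methods.
trait FormatReader {
    fn read(&self, raw: &str) -> Vec<String>;
}

// A reader the "engine" ships with.
struct CsvReader;
impl FormatReader for CsvReader {
    fn read(&self, raw: &str) -> Vec<String> {
        raw.split(',').map(|f| f.trim().to_string()).collect()
    }
}

// A reader added by outside code; the registry never names this type.
struct PipeReader;
impl FormatReader for PipeReader {
    fn read(&self, raw: &str) -> Vec<String> {
        raw.split('|').map(|f| f.trim().to_string()).collect()
    }
}

// Open seam: readers are registered and looked up by name at run time.
struct Registry {
    readers: HashMap<String, Box<dyn FormatReader>>,
}

impl Registry {
    fn new() -> Self {
        Registry { readers: HashMap::new() }
    }

    fn register(&mut self, name: &str, reader: Box<dyn FormatReader>) {
        self.readers.insert(name.to_string(), reader);
    }

    // None means no reader was registered under that name.
    fn read(&self, name: &str, raw: &str) -> Option<Vec<String>> {
        self.readers.get(name).map(|r| r.read(raw))
    }
}

fn main() {
    let mut registry = Registry::new();
    registry.register("csv", Box::new(CsvReader));
    registry.register("pipe", Box::new(PipeReader));
    println!("{:?}", registry.read("pipe", "a | b | c"));
    println!("{:?}", registry.read("xml", "<a/>"));
}
```

An unknown name is an ordinary run-time outcome here, which is acceptable for a plug-in seam and exactly what the operator set avoids by being an enum.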
Feel the exhaustiveness
Here are both strategies side by side. The enum version makes the compiler your checklist; the trait-object version quietly accepts a new kind with no nudge.
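A minimal sketch of that comparison follows. Every name in it (Node, dispatch, Operator, SourceOp, TransformOp, build) is hypothetical and chosen for illustration; none of it is the engine's code. The enum has no Sort variant yet, so uncommenting the marked line shows the compile error on the enum side, while the trait-object side keeps compiling and only fails when the string "sort" arrives.

```rust
// Strategy 1: closed enum + exhaustive match (hypothetical Node type).
enum Node {
    Source,
    Transform,
    // Sort, // uncommenting this makes `dispatch` below fail to compile
}

fn dispatch(node: &Node) -> &'static str {
    match node {
        Node::Source => "ran source",
        Node::Transform => "ran transform",
        // no `_ =>` arm: the compiler checks that every variant is covered
    }
}

// Strategy 2: trait objects (hypothetical Operator trait; the engine has none).
trait Operator {
    fn run(&self) -> &'static str;
}

struct SourceOp;
struct TransformOp;

impl Operator for SourceOp {
    fn run(&self) -> &'static str {
        "ran source"
    }
}

impl Operator for TransformOp {
    fn run(&self) -> &'static str {
        "ran transform"
    }
}

// The kind-to-operator lookup is ordinary run-time code. A kind nobody
// wired in (such as "sort") compiles fine and only fails when it shows up.
fn build(kind: &str) -> Result<Box<dyn Operator>, String> {
    match kind {
        "source" => Ok(Box::new(SourceOp)),
        "transform" => Ok(Box::new(TransformOp)),
        other => Err(format!("no operator for kind '{}'", other)),
    }
}

fn main() {
    println!("{}", dispatch(&Node::Source));
    println!("{}", dispatch(&Node::Transform));

    for kind in ["source", "sort"] {
        match build(kind) {
            Ok(op) => println!("{}", op.run()),
            Err(e) => println!("run-time failure: {}", e),
        }
    }
}
```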
Add a Sort variant and the build breaks at dispatch, pointing at the exact code that doesn’t
yet handle it. That is the compiler acting as a complete, always-current checklist of “every
place a new operator must be wired in.” A Box<dyn Operator> design would have compiled fine and
failed later, at run time, when an unhandled node showed up.
// quick check
Why does the executor dispatch DAG nodes through a closed enum + exhaustive match instead of Box<dyn Operator>?
Answer: dyn exists to let outside code plug in unnamed types at run time (the IO seam). The operator set is closed and engine-owned, so a closed enum + wildcard-free match turns “add an operator” into a compiler-checked checklist, with no dynamic-dispatch indirection.
Read the dispatcher
You’ve seen how the executor decides which operator to run. Next up is how it runs them at once: the threads that drive each source and the bounded channels that connect them.