Trace one record end-to-end
You have all the pieces now — Value, Record, Schema, the CompiledPlan, the DAG,
the dispatch. This lesson threads them together by following one customer, Alice,
from a line in a CSV file to a line in the output, naming every part of the engine she
touches.
You’ll be able to: narrate a record’s full journey through customer_etl using the
right names for each stage — the skill the whole rest of the curriculum builds on.
Alice’s row
clinker ·customer_etl.yaml example @47d2e12
In the source CSV she’s one line:
customer_id,first_name,last_name,email,status,lifetime_value,zip_code
1001,Alice,Chen,alice.chen@acme.com,active,15200,94103
Stage 1 — the source reads her
The source node reads the CSV and produces a Record: a Vec<Value> bound to the
schema declared in the YAML. Every field is a string at this point:
clinker-record ·mod.rs ·Record type @47d2e12
schema: [customer_id, first_name, last_name, email, status, lifetime_value, zip_code]
values: [ "1001", "Alice", "Chen", ..., "active", "15200", "94103" ]
Stage 2 — active_only flags her
The first transform runs this CXL over every record:
emit is_active = status == "active"
Alice’s status is "active", so the comparison is true. The transform emits a new
field — is_active = Bool(true) — and passes the enlarged record downstream. Her row
now carries an eighth value.
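If it helps to see stages 1 and 2 as code, here is a minimal Rust sketch. It is a model for this lesson, not the engine's source: the Value enum is cut down to two variants, Record keeps its schema as a plain Vec<String>, and from_csv_line, get and emit are hypothetical helper names. The real types live in clinker-record and will differ.

```rust
// Hypothetical stand-ins for the engine's types, trimmed to what this lesson needs.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    String(String),
    Bool(bool),
}

#[derive(Debug, Clone, PartialEq)]
struct Record {
    schema: Vec<String>, // field names, in column order
    values: Vec<Value>,  // one value per schema entry
}

impl Record {
    /// Stage 1: every field arrives as a string.
    /// Naive comma split; real CSV parsing also handles quoting.
    fn from_csv_line(header: &str, line: &str) -> Record {
        Record {
            schema: header.split(',').map(str::to_string).collect(),
            values: line
                .split(',')
                .map(|s| Value::String(s.to_string()))
                .collect(),
        }
    }

    /// Look a value up by field name.
    fn get(&self, field: &str) -> Option<&Value> {
        let idx = self.schema.iter().position(|name| name == field)?;
        self.values.get(idx)
    }

    /// Append a new field and its value to the end of the record.
    fn emit(&mut self, field: &str, value: Value) {
        self.schema.push(field.to_string());
        self.values.push(value);
    }
}

/// Stage 2: models `emit is_active = status == "active"`.
fn active_only(mut record: Record) -> Record {
    let is_active = record.get("status") == Some(&Value::String("active".to_string()));
    record.emit("is_active", Value::Bool(is_active));
    record
}

fn main() {
    let header = "customer_id,first_name,last_name,email,status,lifetime_value,zip_code";
    let alice = Record::from_csv_line(
        header,
        "1001,Alice,Chen,alice.chen@acme.com,active,15200,94103",
    );
    let flagged = active_only(alice);
    println!("is_active = {:?}", flagged.get("is_active"));
    println!("{} values", flagged.values.len());
}
```

Note that active_only never removes anything: it only appends, which is why the record grows from seven values to eight.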
Stage 3 — final_flag tiers her
The second transform:
emit tier = if lifetime_value.to_int() > $vars.gold_threshold then "gold" else "standard"
Here lifetime_value ("15200") is finally turned into a number by .to_int() and
compared against $vars.gold_threshold (default 10000). 15200 > 10000 is true, so
tier = "gold". This is the coercion we promised back in lesson 1.1: the string becomes
an integer exactly when a transform needs it to, not before.
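The same on-demand coercion can be sketched in Rust. The function name tier and its signature are made up for illustration, and returning a parse error for a non-numeric string is an assumption: this lesson does not say what CXL's .to_int() does with bad input.

```rust
use std::num::ParseIntError;

/// Models `if lifetime_value.to_int() > $vars.gold_threshold then "gold" else "standard"`.
/// The string is parsed here, at the point of comparison, and nowhere earlier.
fn tier(lifetime_value: &str, gold_threshold: i64) -> Result<&'static str, ParseIntError> {
    let amount: i64 = lifetime_value.trim().parse()?;
    Ok(if amount > gold_threshold { "gold" } else { "standard" })
}

fn main() {
    let gold_threshold = 10_000; // the default named in this lesson

    // Alice: 15200 > 10000, so she is tiered "gold".
    println!("{:?}", tier("15200", gold_threshold));

    // The comparison is strict, so a value equal to the threshold stays "standard".
    println!("{:?}", tier("10000", gold_threshold));

    // Assumption: a non-numeric string surfaces as an error rather than a tier.
    println!("{:?}", tier("n/a", gold_threshold).is_err());
}
```

Up to this point "15200" has been carried as text the whole way; only this one comparison pays for the parse.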
Stage 4 — the output writes her
The output node writes the record, now with is_active and tier added, to
./output/customers.csv. Alice leaves the engine as:
...,active,15200,94103,true,gold
Watch it happen
--dry-run runs the real pipeline and writes to your terminal:
cd examples/pipelines
cargo run -p clinker -- run customer_etl.yaml --dry-run -n 5
You saw the summary in Phase 0: 5 total, 5 ok, 5 written, 0 dlq. Alice is one of those
five — ok and written. (Carol, who is inactive, still flows through; active_only
just flags her is_active = false. Nothing is dropped here — filtering is a later
topic.)
Your turn: narrate Bob
💡 Hint 1
Apply the same two CXL rules. Is Bob’s status "active"? Is his lifetime_value (as
an integer) greater than the gold_threshold of 10000?
Show solution
Bob’s status is "active", so is_active = true. His lifetime_value is 8400, and
8400 > 10000 is false, so tier = "standard". He leaves as
...,active,8400,10001,true,standard.
// quick check
At which stage does Alice's lifetime_value stop being a string and become a number?
The source reads everything as Value::String. The conversion happens in final_flag, where lifetime_value.to_int() is needed for the comparison — coercion on demand, not up front.
That’s Phase 1
You can now follow a record from source to output and name every stage: a Record of
Values bound to a Schema, produced by a source, transformed node-by-node as the
executor walks the CompiledPlan’s DAG, and written by an output. That mental map is
the spine of everything ahead.
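As a recap, the whole journey fits in one small Rust sketch. It simplifies heavily: the DAG is flattened to an ordered slice of function pointers, a record is just (name, text) pairs rather than typed Values, and the names run, field and Transform are invented for this example. Treat it as a picture of the shape, not of the engine's executor.

```rust
// Simplified record: (field name, text) pairs instead of the engine's typed Values.
type Record = Vec<(String, String)>;
// One node of the plan: a function that may append fields to the record.
type Transform = fn(&mut Record);

fn field<'a>(record: &'a Record, name: &str) -> Option<&'a str> {
    record.iter().find(|(k, _)| k == name).map(|(_, v)| v.as_str())
}

/// Models `emit is_active = status == "active"`.
fn active_only(record: &mut Record) {
    let is_active = field(record, "status") == Some("active");
    record.push(("is_active".to_string(), is_active.to_string()));
}

/// Models the tier rule, with the default threshold hard-coded.
fn final_flag(record: &mut Record) {
    const GOLD_THRESHOLD: i64 = 10_000;
    let amount = field(record, "lifetime_value").and_then(|v| v.parse::<i64>().ok());
    // Assumption: an unparsable value falls back to "standard" in this sketch.
    let tier = if amount.map_or(false, |a| a > GOLD_THRESHOLD) {
        "gold"
    } else {
        "standard"
    };
    record.push(("tier".to_string(), tier.to_string()));
}

/// Source, then each node in plan order, then output.
fn run(header: &str, line: &str, plan: &[Transform]) -> String {
    // Source: bind each value to its schema name (naive comma split).
    let mut record: Record = header
        .split(',')
        .zip(line.split(','))
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect();
    // Executor: visit the nodes in order, each one enlarging the record.
    for node in plan {
        node(&mut record);
    }
    // Output: write the values back out as one CSV line.
    record
        .iter()
        .map(|(_, v)| v.as_str())
        .collect::<Vec<_>>()
        .join(",")
}

fn main() {
    let header = "customer_id,first_name,last_name,email,status,lifetime_value,zip_code";
    let plan: [Transform; 2] = [active_only, final_flag];
    let out = run(
        header,
        "1001,Alice,Chen,alice.chen@acme.com,active,15200,94103",
        &plan,
    );
    println!("{out}");
}
```

Running it prints Alice's row with true and gold appended, the same two trailing values she leaves the engine with.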
Phase 2 — Data & Representation revisits the first stage in depth: what a Value
really costs, how records borrow instead of copy, and the ownership and lifetime rules
that make the engine fast. Same journey, deeper each pass.