Add a CXL builtin
CXL is the small expression language you met in lesson 3.7 — emit full_name = first_name + " " + last_name, compiled once and evaluated per record. Its string and numeric
methods (upper(), trim(), length(), …) are builtins. Adding one is the
smallest real contribution you can make to the engine, and it’s the perfect lesson in a
subtle architectural fact: a builtin lives in two separate tables, keyed by the same
method name, and nothing in the type system forces you to update both.
You’ll be able to: name the two places a builtin is defined and which compiler stage reads each, write the signature entry and the eval arm for a new method, and explain the silent failure mode of forgetting one of the two.
A builtin is a signature plus an implementation — stored apart
Recall CXL’s staged pipeline: parse → resolve → typecheck → eval (lesson 3.7). A builtin has to show up in two of those stages, and clinker stores those two halves in two different files:
- The signature, consulted at typecheck: what type does s.upper() return? It lives in a registry of BuiltinDef records.
- The implementation, run at eval: what does upper() actually do to the value? It lives in a big match on the method name.
Start with the signature side. The registry is not a list of function pointers and not a
trait — it’s two hash maps of BuiltinDef records, built once:
cxl ·builtins.rs ·BuiltinDef type @47d2e12
pub struct BuiltinDef {
    pub name: &'static str,
    pub receiver: TypeTag,        // the type the method is called on
    pub args: Vec<TypeTag>,       // expected argument types
    pub min_args: usize,
    pub max_args: Option<usize>,
    pub return_type: TypeTag,     // what typecheck records for the call
    pub category: Category,
}

cxl ·builtins.rs ·BuiltinRegistry type @47d2e12
/// Registry of all built-in methods and window functions.
pub struct BuiltinRegistry {
    methods: ahash::HashMap<&'static str, BuiltinDef>,
    window_fns: ahash::HashMap<&'static str, BuiltinDef>,
}

BuiltinRegistry::new() fills methods imperatively. String methods are declared
through a little closure s(...) and an array, so each entry is a single readable line —
upper takes no args and returns a String:
let s = |name, args, min, max, ret| (name, BuiltinDef {
    name,
    receiver: TypeTag::String,
    args,
    min_args: min,
    max_args: max,
    return_type: ret,
    category: Category::String,
});
for (n, d) in [
    s("upper", vec![], 0, Some(0), TypeTag::String),
    s("lower", vec![], 0, Some(0), TypeTag::String),
    // ...24 string methods in all
] {
    methods.insert(n, d);
}

Notice what BuiltinDef does not have: any field holding the implementation. The
registry knows a method’s shape, never its behavior.
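
The signature side can be tried in isolation. The following is a minimal sketch, not clinker's code: it assumes std's HashMap in place of ahash, trims TypeTag and Category down to the variants it needs, and adds a hypothetical slugify entry plus a hypothetical return_type_of lookup that falls back to Any on a miss.

```rust
use std::collections::HashMap;

// Trimmed stand-ins for the real enums (assumption: only these variants are needed here).
#[derive(Debug, Clone, Copy, PartialEq)]
enum TypeTag { String, Any }

#[derive(Debug, Clone, Copy, PartialEq)]
enum Category { String }

// Same fields as the BuiltinDef shown above; note there is no implementation field.
#[derive(Debug, Clone)]
struct BuiltinDef {
    name: &'static str,
    receiver: TypeTag,
    args: Vec<TypeTag>,
    min_args: usize,
    max_args: Option<usize>,
    return_type: TypeTag,
    category: Category,
}

// Builds the string-method table with the closure-plus-array pattern.
fn string_methods() -> HashMap<&'static str, BuiltinDef> {
    let s = |name: &'static str, args: Vec<TypeTag>, min: usize, max: Option<usize>, ret: TypeTag| {
        (name, BuiltinDef {
            name,
            receiver: TypeTag::String,
            args,
            min_args: min,
            max_args: max,
            return_type: ret,
            category: Category::String,
        })
    };
    let mut methods = HashMap::new();
    for (n, d) in [
        s("upper", vec![], 0, Some(0), TypeTag::String),
        s("lower", vec![], 0, Some(0), TypeTag::String),
        s("slugify", vec![], 0, Some(0), TypeTag::String), // hypothetical new entry
    ] {
        methods.insert(n, d);
    }
    methods
}

// Hypothetical typecheck-side lookup: reads only return_type, Any on a miss.
fn return_type_of(methods: &HashMap<&'static str, BuiltinDef>, name: &str) -> TypeTag {
    methods.get(name).map(|d| d.return_type).unwrap_or(TypeTag::Any)
}
```

Adding the slugify line is the whole signature-side change in this sketch. Nothing in it says what slugify does, which is the point of the section.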
The implementation lives in a separate match
The behavior is in a different file: the eval kernel’s dispatch_method, a match on
the method-name string that returns Ok(None) for anything it doesn’t recognize:
cxl ·builtins_impl.rs ·dispatch_method fn @47d2e12
/// Dispatch a method call on a receiver value.
/// Returns None if the method is not a known built-in (caller should error).
pub fn dispatch_method(
    receiver: &Value,
    method: &str,
    args: &[Value],
    regex: Option<&Regex>,
    span: Span,
    ctx: &EvalContext<'_>,
) -> Result<Option<Value>, EvalError> {
    // ...null propagation first...
    match method {
        "upper" => Ok(Some(string_op(receiver, span, |s| {
            Value::String(s.to_uppercase().into())
        }))),
        "lower" => Ok(Some(string_op(receiver, span, |s| { /* ... */ }))),
        // ...one arm per builtin...
        _ => Ok(None), // unknown method → caller raises an error
    }
}

So "upper" appears twice, in two files: once as a BuiltinDef in builtins.rs
(its return type is String), and once as a match arm in builtins_impl.rs (it
uppercases). The two are linked only by the string "upper" — there is no shared enum,
no trait, nothing the compiler checks.
Why this matters: the silent half-add
Because the two tables are independent, you can update one and forget the other, and the program still compiles. The failure shows up at runtime or as a missing type:
- Signature only, no eval arm: typecheck is happy (it found the return type), but evaluation hits the _ => Ok(None) fall-through and the caller raises “unknown method.”
- Eval arm only, no signature: evaluation works, but typecheck never finds a BuiltinDef, so the call’s type silently falls back to Any (for scalar methods, clinker’s typechecker reads only return_type from the registry and does not even enforce arg types, so a missing signature degrades quietly rather than erroring).
Here is that two-table coupling as a runnable toy. The signature table and the impl match are keyed by the same string; comment out one half and watch a method become half-defined:
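
A minimal sketch of such a toy, under stated assumptions: TypeTag, Value, signatures, dispatch, typecheck, and eval below are simplified stand-ins and not clinker's real API. The slugify and shout names are hypothetical, each deliberately defined in only one of the two tables.

```rust
use std::collections::HashMap;

// Toy stand-ins; none of these are clinker's real types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum TypeTag { String, Int, Any }

#[derive(Debug, Clone, PartialEq)]
enum Value { Str(String), Int(i64) }

// Table 1: signatures, read at typecheck.
fn signatures() -> HashMap<&'static str, TypeTag> {
    let mut m = HashMap::new();
    m.insert("upper", TypeTag::String);
    m.insert("length", TypeTag::Int);
    m.insert("slugify", TypeTag::String); // signature only: no eval arm below
    m
}

// Table 2: implementations, run at eval, keyed by the same strings.
fn dispatch(receiver: &str, method: &str) -> Option<Value> {
    match method {
        "upper" => Some(Value::Str(receiver.to_uppercase())),
        "length" => Some(Value::Int(receiver.chars().count() as i64)),
        // eval arm only: no signature above
        "shout" => Some(Value::Str(format!("{}!", receiver.to_uppercase()))),
        _ => None, // unknown method: the caller turns this into an error
    }
}

// Typecheck consults only the signature table; a miss falls back to Any.
fn typecheck(sigs: &HashMap<&'static str, TypeTag>, method: &str) -> TypeTag {
    sigs.get(method).copied().unwrap_or(TypeTag::Any)
}

// Eval consults only the match; a miss becomes an "unknown method" error.
fn eval(receiver: &str, method: &str) -> Result<Value, String> {
    dispatch(receiver, method).ok_or_else(|| format!("unknown method: {}", method))
}
```

Calling typecheck and eval for "slugify" shows the first failure mode (a String type, then an error at eval), and "shout" shows the second (a working eval whose type is Any). Both halves compile without complaint.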
The lesson the toy makes concrete: adding a builtin is a two-file change, and the checklist for “did I really add it?” is human discipline plus a test — not the compiler.
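
One way to move part of that discipline into a test is to walk every registered name through the dispatcher. This is a hedged sketch on a hypothetical miniature of the two tables (SIGNATURES, dispatch, and half_added are illustrative names); nothing in the source says clinker ships such a check.

```rust
// Hypothetical miniature of the signature table: just the method names.
const SIGNATURES: &[&str] = &["upper", "lower", "trim"];

// Hypothetical miniature of the eval match.
fn dispatch(receiver: &str, method: &str) -> Option<String> {
    match method {
        "upper" => Some(receiver.to_uppercase()),
        "lower" => Some(receiver.to_lowercase()),
        "trim" => Some(receiver.trim().to_string()),
        _ => None, // unknown method
    }
}

/// Returns every method name that has a signature but no eval arm.
fn half_added() -> Vec<&'static str> {
    SIGNATURES
        .iter()
        .copied()
        .filter(|name| dispatch("probe", name).is_none())
        .collect()
}

#[test]
fn every_signature_has_an_eval_arm() {
    assert_eq!(half_added(), Vec::<&str>::new());
}
```

This only catches one direction. A match cannot be enumerated at runtime, so an eval arm with no signature still needs a test that checks the call's type.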
The change-set, and the test that proves it
To add a scalar method end-to-end you edit two files:
- crates/cxl/src/builtins.rs: add one entry to BuiltinRegistry::new() (pick the right category helper / array) so typecheck knows the receiver, args, and return type.
- crates/cxl/src/eval/builtins_impl.rs: add one match arm to dispatch_method returning Ok(Some(value)), reusing helpers like string_op.
(A closure-bearing method like map/filter, or a window function, needs extra
wiring in eval/compiled.rs and the window_fns table — out of scope here; the scalar
case is the clean first contribution.) Then prove it through the full stack — a test that
parses, resolves, typechecks, and evaluates a program that calls the method:
cxl ·tests.rs ·string_methods test @47d2e12
#[test]
fn string_methods() {
    // drives whole programs like emit out = s.upper() through
    // parse -> resolve -> typecheck -> compile -> eval, asserting
    // s.upper() on "abc" yields Value::String("ABC"), etc.
}

// quick check
You add a BuiltinDef for a new method `slugify` to BuiltinRegistry::new() but forget to add an arm to dispatch_method. What happens when a pipeline calls s.slugify()?
The two tables are linked only by the method-name string, with no compile-time check between them. A signature without an eval arm typechecks fine, then fails at eval via the catch-all. That gap is exactly why a parse-to-eval test (like string_methods) is the real proof you added a builtin, not just half of one.
Add one yourself
You’ve extended the expression language. Next: extend the engine’s edges by adding a whole new file format behind the reader/writer seam.