
Add a CXL builtin

CXL is the small expression language you met in lesson 3.7 — emit full_name = first_name + " " + last_name, compiled once and evaluated per record. Its string and numeric methods (upper(), trim(), length(), …) are builtins. Adding one is the smallest real contribution you can make to the engine, and it’s the perfect lesson in a subtle architectural fact: a builtin lives in two separate tables, keyed by the same method name, and nothing in the type system forces you to update both.

You’ll be able to: name the two places a builtin is defined and which compiler stage reads each, write the signature entry and the eval arm for a new method, and explain the silent failure mode of forgetting one of the two.

A builtin is a signature plus an implementation — stored apart

Recall CXL’s staged pipeline: parse → resolve → typecheck → eval (lesson 3.7). A builtin has to show up in two of those stages, and clinker stores those two halves in two different files:

  1. The signature, consulted at typecheck — what type does s.upper() return? It lives in a registry of BuiltinDef records.
  2. The implementation, run at eval — what does upper() actually do to the value? It lives in a big match on the method name.

Start with the signature side. The registry is not a list of function pointers and not a trait — it’s two hash maps of BuiltinDef records, built once:

cxl ·builtins.rs ·BuiltinDef type @47d2e12
pub struct BuiltinDef {
    pub name: &'static str,
    pub receiver: TypeTag,      // the type the method is called on
    pub args: Vec<TypeTag>,     // expected argument types
    pub min_args: usize,
    pub max_args: Option<usize>,
    pub return_type: TypeTag,   // what typecheck records for the call
    pub category: Category,
}
cxl ·builtins.rs ·BuiltinRegistry type @47d2e12
/// Registry of all built-in methods and window functions.
pub struct BuiltinRegistry {
    methods: ahash::HashMap<&'static str, BuiltinDef>,
    window_fns: ahash::HashMap<&'static str, BuiltinDef>,
}

BuiltinRegistry::new() fills methods imperatively. String methods are declared through a little closure s(...) and an array, so each entry is a single readable line — upper takes no args and returns a String:

let s = |name, args, min, max, ret| (name, BuiltinDef {
    name, receiver: TypeTag::String, args,
    min_args: min, max_args: max, return_type: ret, category: Category::String,
});
for (n, d) in [
    s("upper", vec![], 0, Some(0), TypeTag::String),
    s("lower", vec![], 0, Some(0), TypeTag::String),
    // ...24 string methods in all
] { methods.insert(n, d); }

Notice what BuiltinDef does not have: any field holding the implementation. The registry knows a method’s shape, never its behavior.
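To make the signature side concrete, here is a minimal sketch of a table built the same way, plus the kind of lookup a typecheck stage could perform. It is not clinker's code: TypeTag is cut down to three assumed variants, category is omitted, std's HashMap stands in for ahash, length returning Int is a guess, and slugify and return_type_of are hypothetical names used only for illustration.

```rust
use std::collections::HashMap;

// Simplified stand-in for clinker's TypeTag; the variants are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TypeTag { String, Int, Any }

// Same shape as the lesson's BuiltinDef, minus `category`.
#[derive(Debug)]
pub struct BuiltinDef {
    pub name: &'static str,
    pub receiver: TypeTag,
    pub args: Vec<TypeTag>,
    pub min_args: usize,
    pub max_args: Option<usize>,
    pub return_type: TypeTag,
}

/// Build a toy method table: a helper closure plus an array of one-line entries.
pub fn build_methods() -> HashMap<&'static str, BuiltinDef> {
    let s = |name: &'static str, args: Vec<TypeTag>, min: usize,
             max: Option<usize>, ret: TypeTag| {
        (name, BuiltinDef {
            name, receiver: TypeTag::String, args,
            min_args: min, max_args: max, return_type: ret,
        })
    };
    let mut methods = HashMap::new();
    for (n, d) in [
        s("upper", vec![], 0, Some(0), TypeTag::String),
        s("length", vec![], 0, Some(0), TypeTag::Int), // return type assumed
        // Hypothetical new builtin: this one line is the whole signature-side change.
        s("slugify", vec![], 0, Some(0), TypeTag::String),
    ] {
        methods.insert(n, d);
    }
    methods
}

/// What a typecheck stage might do with the table: read `return_type`,
/// falling back to `Any` when no signature is registered.
pub fn return_type_of(methods: &HashMap<&'static str, BuiltinDef>, method: &str) -> TypeTag {
    methods.get(method).map(|d| d.return_type).unwrap_or(TypeTag::Any)
}
```

With this table, return_type_of(&build_methods(), "slugify") yields TypeTag::String, while an unregistered name yields TypeTag::Any. Nothing in the table says what slugify does.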

The implementation lives in a separate match

The behavior is in a different file — the eval kernel’s dispatch_method, a match on the method-name string that returns Ok(None) for anything it doesn’t recognize:

cxl ·builtins_impl.rs ·dispatch_method fn @47d2e12
/// Dispatch a method call on a receiver value.
/// Returns None if the method is not a known built-in (caller should error).
pub fn dispatch_method(
    receiver: &Value, method: &str, args: &[Value],
    regex: Option<&Regex>, span: Span, ctx: &EvalContext<'_>,
) -> Result<Option<Value>, EvalError> {
    // ...null propagation first...
    match method {
        "upper" => Ok(Some(string_op(receiver, span, |s| {
            Value::String(s.to_uppercase().into())
        }))),
        "lower" => Ok(Some(string_op(receiver, span, |s| { /* ... */ }))),
        // ...one arm per builtin...
        _ => Ok(None), // unknown method → caller raises an error
    }
}
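The excerpt above elides most of its surroundings, so here is a self-contained sketch of the same dispatch shape that can be compiled on its own. Everything in it is a simplified stand-in: Value has only three variants, string_op drops the span parameter, errors are plain strings, and both the null-propagation rule and the Null result for a non-string receiver are assumptions rather than clinker's documented behavior.

```rust
// Simplified stand-in for clinker's Value; the real enum has more variants.
#[derive(Debug, Clone, PartialEq)]
pub enum Value { Null, Int(i64), String(String) }

/// Hypothetical stand-in for `string_op`: apply `f` to a string receiver.
/// Returning Null for a non-string receiver is an assumption of this sketch.
fn string_op(receiver: &Value, f: impl FnOnce(&str) -> Value) -> Value {
    match receiver {
        Value::String(s) => f(s.as_str()),
        _ => Value::Null,
    }
}

/// Toy dispatch: Ok(None) means "not a builtin", mirroring the excerpt above.
pub fn dispatch_method(receiver: &Value, method: &str) -> Result<Option<Value>, String> {
    // Null propagation runs before the match (assumed: any method on Null is Null).
    if *receiver == Value::Null {
        return Ok(Some(Value::Null));
    }
    match method {
        "upper" => Ok(Some(string_op(receiver, |s| Value::String(s.to_uppercase())))),
        "length" => Ok(Some(string_op(receiver, |s| Value::Int(s.chars().count() as i64)))),
        // No arm for anything else: the caller turns None into an error.
        _ => Ok(None),
    }
}
```

Calling dispatch_method on Value::String("abc") with "upper" returns Ok(Some(Value::String("ABC"))); calling it with a name that has no arm returns Ok(None), whatever a signature table elsewhere might claim.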

So "upper" appears twice, in two files: once as a BuiltinDef in builtins.rs (its return type is String), and once as a match arm in builtins_impl.rs (it uppercases). The two are linked only by the string "upper" — there is no shared enum, no trait, nothing the compiler checks.

Because the two tables are independent, you can update one and forget the other, and the program still compiles. The failure shows up at runtime or as a missing type:

  • Signature only, no eval arm: typecheck is happy (it found the return type), but evaluation hits the _ => Ok(None) fall-through and the caller raises “unknown method.”
  • Eval arm only, no signature: evaluation works, but typecheck never finds a BuiltinDef, so the call’s type silently falls back to Any. For scalar methods, clinker’s typechecker reads only return_type from the registry and does not enforce argument types, so a missing signature degrades quietly rather than erroring.

Here is that two-table coupling as a runnable toy. The signature table and the impl match are keyed by the same string; comment out one half and watch a method become half-defined:
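A minimal sketch of such a toy, under stated assumptions: Ty, signatures, eval, and check are hypothetical names, values are plain strings, and slugify and shout are made-up methods that each exist in only one table.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Ty { Str, Int, Any }

/// Table 1: signatures, read at typecheck.
fn signatures() -> HashMap<&'static str, Ty> {
    let mut m = HashMap::new();
    m.insert("upper", Ty::Str);
    m.insert("length", Ty::Int);
    m.insert("slugify", Ty::Str); // signature only: no eval arm below
    m
}

/// Table 2: implementations, run at eval. None means "unknown method".
fn eval(method: &str, input: &str) -> Option<String> {
    match method {
        "upper" => Some(input.to_uppercase()),
        "length" => Some(input.chars().count().to_string()),
        "shout" => Some(format!("{}!", input.to_uppercase())), // arm only: no signature above
        _ => None,
    }
}

/// Run both stages for one call and report what each stage sees.
fn check(method: &str, input: &str) -> String {
    // Typecheck: a missing signature quietly becomes Any.
    let ty = signatures().get(method).copied().unwrap_or(Ty::Any);
    // Eval: a missing arm becomes an error.
    match eval(method, input) {
        Some(v) => format!("{}: type {:?}, value {}", method, ty, v),
        None => format!("{}: type {:?}, error: unknown method", method, ty),
    }
}
```

Printing check("upper", "abc"), check("slugify", "abc"), and check("shout", "abc") from a main function gives "upper: type Str, value ABC", "slugify: type Str, error: unknown method", and "shout: type Any, value ABC!". The last two are the half-defined cases; deleting a line from either table moves a method between them, and the program still compiles.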

The lesson the toy makes concrete: adding a builtin is a two-file change, and the checklist for “did I really add it?” is human discipline plus a test — not the compiler.
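One way to turn part of that discipline into a test is to probe the dispatch function with every name the signature table declares. The sketch below assumes the declared names can be listed and that a probe call with a dummy receiver is harmless; SIGNATURES, dispatch, and signatures_without_impl are hypothetical stand-ins, not clinker's API.

```rust
/// Names the signature table declares (stand-in for the registry's keys).
const SIGNATURES: &[&str] = &["upper", "lower", "slugify"];

/// Stand-in for the eval match; `slugify` is deliberately missing.
fn dispatch(method: &str, input: &str) -> Option<String> {
    match method {
        "upper" => Some(input.to_uppercase()),
        "lower" => Some(input.to_lowercase()),
        _ => None,
    }
}

/// Every declared name that falls through the dispatch match.
fn signatures_without_impl() -> Vec<&'static str> {
    SIGNATURES
        .iter()
        .copied()
        .filter(|name| dispatch(name, "probe").is_none())
        .collect()
}
```

A test asserting that signatures_without_impl() is empty would fail here and name slugify. The check covers only one direction: match arms cannot be enumerated at runtime, so an arm with no signature still needs an end-to-end test to catch it.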

The change-set, and the test that proves it

To add a scalar method end-to-end you edit two files:

  1. crates/cxl/src/builtins.rs — add one entry to BuiltinRegistry::new() (pick the right category helper / array) so typecheck knows the receiver, args, and return type.
  2. crates/cxl/src/eval/builtins_impl.rs — add one match arm to dispatch_method returning Ok(Some(value)), reusing helpers like string_op.
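As a worked instance of those two edits, take a hypothetical slugify method. Its behavior is not specified anywhere in clinker, so the function below is only one plausible definition, and the two commented edits simply copy the pattern the existing upper entries use.

```rust
/// Hypothetical behavior for `slugify`: keep ASCII letters and digits
/// (lowercased) and collapse every run of other characters into one '-',
/// with no dash at the start or end.
pub fn slugify(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    let mut pending_dash = false;
    for c in s.chars() {
        if c.is_ascii_alphanumeric() {
            // Emit a separator only between two kept characters.
            if pending_dash && !out.is_empty() {
                out.push('-');
            }
            pending_dash = false;
            out.push(c.to_ascii_lowercase());
        } else {
            pending_dash = true;
        }
    }
    out
}

// Edit 1, builtins.rs (signature), inside the string-method array:
//     s("slugify", vec![], 0, Some(0), TypeTag::String),
//
// Edit 2, builtins_impl.rs (implementation), inside dispatch_method:
//     "slugify" => Ok(Some(string_op(receiver, span, |s| {
//         Value::String(slugify(s).into())
//     }))),
```

Keeping the string logic in a plain function like this makes it easy to unit-test on its own, but it does not replace the full-stack test: only a program that actually calls s.slugify() exercises both tables together.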

(A closure-bearing method like map/filter, or a window function, needs extra wiring in eval/compiled.rs and the window_fns table. That is out of scope here; the scalar case is the clean first contribution.)

Then prove it through the full stack with a test that parses, resolves, typechecks, and evaluates a program that calls the method:

cxl ·tests.rs ·string_methods test @47d2e12
#[test]
fn string_methods() {
    // drives whole programs like emit out = s.upper() through
    // parse -> resolve -> typecheck -> compile -> eval, asserting
    // s.upper() on "abc" yields Value::String("ABC"), etc.
}

// quick check

You add a BuiltinDef for a new method `slugify` to BuiltinRegistry::new() but forget to add an arm to dispatch_method. What happens when a pipeline calls s.slugify()?

You’ve extended the expression language. Next: extend the engine’s edges — add a whole new file format behind the reader/writer seam.