Bounded memory in action
Last lesson built the arbitrator’s shared, interior-mutable state. This one is about the decisions it makes with that state: when memory gets tight, does it spill a buffer, pause a source, or abort the run? And — the part that saves you from a wedged job — how does it refuse a budget it can never satisfy, before a single record moves? The answer threads back through two earlier ideas: a swappable strategy (where a trait object is finally the right call) and the fail-fast error taxonomy from Phase 3.
You’ll be able to: explain the spill/pause/abort decision, recognise why the policy is a trait object here (when operators were a closed enum), and explain the unsatisfiable-budget rejection.
The decision is a set of gates, not one verdict
There’s no single enum { Spill, Pause, Abort }. The arbitrator exposes a few boolean gates that
operators poll. The central one is should_spill, checked at every bulk admission:
clinker-exec · memory.rs · should_spill fn @47d2e12
```rust
pub fn should_spill(&self) -> bool {
    self.observe(); // refresh peak RSS
    let soft = self.soft_limit();
    let tripped = self.peak_rss.load(Ordering::Relaxed) > soft
        || self.sum_consumer_usage() > soft; // RSS OR summed charged bytes
    if tripped {
        self.poll_arbitration(); // run ONE arbitration round
    }
    tripped
}
```
Two things are worth noting. It trips on either real process RSS or the summed charged bytes,
so it still works on a platform where RSS can’t be read (the charged-byte sum is the backstop). And
when it trips, it runs one arbitration round (poll_arbitration) that actually does something
about the pressure. Its sibling should_abort checks the hard limit and is the line that turns
runaway memory into a fatal error.
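To make the two gates concrete, here is a minimal standalone sketch. Everything in it is an assumption for illustration: the Gauge struct, its fields, and the use of Option to model an unreadable RSS are hypothetical, and should_abort is assumed to apply the same either-signal test against the hard limit. It is not clinker’s arbitrator.

```rust
/// Hypothetical stand-in for the arbitrator's readings (not clinker's real type).
struct Gauge {
    soft_limit: u64,
    hard_limit: u64,
    /// `None` models a platform where process RSS cannot be read (assumption).
    peak_rss: Option<u64>,
    /// Bytes each consumer has charged against the budget.
    charged: Vec<u64>,
}

impl Gauge {
    fn charged_sum(&self) -> u64 {
        self.charged.iter().sum()
    }

    /// True when EITHER signal exceeds `limit`; the charged sum is the backstop.
    fn over(&self, limit: u64) -> bool {
        self.peak_rss.map_or(false, |rss| rss > limit) || self.charged_sum() > limit
    }

    fn should_spill(&self) -> bool {
        self.over(self.soft_limit)
    }

    fn should_abort(&self) -> bool {
        self.over(self.hard_limit)
    }
}

fn main() {
    // RSS is unreadable here, so only the charged bytes (60 + 50) can trip a gate.
    let g = Gauge { soft_limit: 100, hard_limit: 200, peak_rss: None, charged: vec![60, 50] };
    println!("should_spill = {}", g.should_spill());
    println!("should_abort = {}", g.should_abort());
}
```

With RSS unavailable, the soft gate still trips on the charged sum alone, while the hard gate stays quiet until usage passes the hard limit.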
The policy is a trait object — and that’s correct here
Inside the arbitration round, which operator gets acted on is a strategy, and clinker makes it swappable through a trait:
clinker-exec · memory.rs · ArbitrationPolicy trait @47d2e12
```rust
pub trait ArbitrationPolicy: Send + Sync {
    fn select_victim(
        &self,
        consumers: &[(ConsumerId, &dyn MemoryConsumer)],
        pressure_bytes: u64,
    ) -> Option<ConsumerId>;
}
// concrete policies: LargestFirst, Priority, BackPressurePreferred, NoOpPolicy
```
The arbitrator holds its policy as Box<dyn ArbitrationPolicy> and the round asks it for a victim,
then pauses or spills that operator:
```rust
let victim = self.policy.select_victim(&snapshot, pressure);
if let Some(id) = victim {
    if consumer.can_back_pressure() { consumer.pause(); } // a source: park it
    else { consumer.try_spill(pressure); }                 // an operator: spill it
}
```
Stop and compare with lesson 4.1. There, operators were a closed enum with no dyn, because
the set is engine-owned and fixed. Here the memory policy is a trait object, because it’s a
strategy chosen at run time from config: Spill, Pause, or Both map to different policy
implementations, and a deployment picks one. That’s exactly the open, runtime-chosen seam where
dyn pays off (just like the IO seam in 3.1). Same engine, same author, opposite tools, chosen by
whether the set is closed-and-owned or open-and-configured. Recognising which situation you’re in
is the whole skill.
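The pause-or-spill branch of the round can be sketched on its own. This is a toy under stated assumptions: the MemoryConsumer trait below only borrows the method names from the excerpt (its signatures are guesses), and Source, SortBuffer, and relieve are hypothetical names invented for the illustration.

```rust
/// Hypothetical consumer interface; method names follow the excerpt, signatures are assumed.
trait MemoryConsumer {
    fn can_back_pressure(&self) -> bool;
    fn pause(&mut self);
    fn try_spill(&mut self, bytes: u64);
}

/// A source can be parked, and holds nothing worth spilling.
#[derive(Default)]
struct Source {
    paused: bool,
}

/// A buffering operator cannot be parked, but can move bytes to disk.
struct SortBuffer {
    held: u64,
    spilled: u64,
}

impl MemoryConsumer for Source {
    fn can_back_pressure(&self) -> bool { true }
    fn pause(&mut self) { self.paused = true; }
    fn try_spill(&mut self, _bytes: u64) {}
}

impl MemoryConsumer for SortBuffer {
    fn can_back_pressure(&self) -> bool { false }
    fn pause(&mut self) {}
    fn try_spill(&mut self, bytes: u64) {
        // Spill at most what is actually held.
        let moved = bytes.min(self.held);
        self.held -= moved;
        self.spilled += moved;
    }
}

/// One relief step against an already-chosen victim: park a source, spill anything else.
fn relieve(victim: &mut dyn MemoryConsumer, pressure: u64) {
    if victim.can_back_pressure() {
        victim.pause();
    } else {
        victim.try_spill(pressure);
    }
}

fn main() {
    let mut src = Source::default();
    let mut buf = SortBuffer { held: 500, spilled: 0 };
    relieve(&mut src, 200);
    relieve(&mut buf, 200);
    println!("source paused: {}", src.paused);
    println!("buffer held: {}, spilled: {}", buf.held, buf.spilled);
}
```

The caller never asks what kind of consumer it has; it asks only whether the victim can take back-pressure, which is why the same round serves sources and operators alike.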
```
memory pressure ──▶ should_spill() trips ──▶ poll_arbitration()
                                                     │
                                          policy.select_victim()
                                                     │
                                     ┌───────────────┴───────────────┐
                            can_back_pressure?                   otherwise
                                     │                               │
                             pause the source               spill the operator
```
Reject the impossible budget up front
Here’s the failure the design most wants to prevent. Suppose a config sets a memory limit below the process’s baseline RSS, the memory the engine occupies before reading any data. Under a producer-pausing policy, the arbitrator would pause everything to get under budget and then never be able to resume: a deadlock. So clinker checks for it before the run starts and refuses:
clinker-exec · memory.rs · reject_unsatisfiable_budget fn @47d2e12
```rust
pub fn reject_unsatisfiable_budget(limit: u64, knob: BackpressureKnob) -> Result<(), PipelineError> {
    if !knob.pauses_producers() { return Ok(()); } // only pausing policies can deadlock
    let Some(baseline_rss) = rss_bytes() else { return Ok(()); };
    if limit < baseline_rss {
        return Err(PipelineError::UnsatisfiableMemoryBudget { limit, baseline_rss });
    }
    Ok(())
}
```
That PipelineError::UnsatisfiableMemoryBudget (error code E312) is one of the always-abort
variants from lesson 3.6. There’s no DLQ for “your budget is impossible”; it stops the run before
any thread spawns. This is the bounded-memory contract’s front door: fail loudly at startup, never
wedge mid-run. A guarantee is most valuable when it’s checked before you’ve spent an hour on the job.
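The outcomes of that check are easier to see with numbers plugged in. The sketch below is a simplified stand-in, not the engine’s function: check_budget and BudgetError are hypothetical names, the pausing knob is reduced to a bool, and the baseline RSS is passed in as an argument so the cases are reproducible.

```rust
/// Hypothetical error type standing in for the engine's always-abort variant.
#[derive(Debug, PartialEq)]
enum BudgetError {
    Unsatisfiable { limit: u64, baseline_rss: u64 },
}

/// Simplified check: `pauses_producers` replaces the knob, and the baseline is
/// injected instead of being read from the OS (both are assumptions).
fn check_budget(
    limit: u64,
    pauses_producers: bool,
    baseline_rss: Option<u64>,
) -> Result<(), BudgetError> {
    // Only a producer-pausing policy can park everything and never resume.
    if !pauses_producers {
        return Ok(());
    }
    // If the baseline cannot be measured, there is nothing to compare against.
    let baseline_rss = match baseline_rss {
        Some(bytes) => bytes,
        None => return Ok(()),
    };
    if limit < baseline_rss {
        return Err(BudgetError::Unsatisfiable { limit, baseline_rss });
    }
    Ok(())
}

fn main() {
    const MIB: u64 = 1024 * 1024;
    let cases = [
        ("pausing, limit below baseline", 64 * MIB, true, Some(128 * MIB)),
        ("spill-only, limit below baseline", 64 * MIB, false, Some(128 * MIB)),
        ("pausing, baseline unreadable", 64 * MIB, true, None),
        ("pausing, limit above baseline", 256 * MIB, true, Some(128 * MIB)),
    ];
    for (label, limit, pauses, baseline) in cases {
        println!("{label}: {:?}", check_budget(limit, pauses, baseline));
    }
}
```

Only the first case is rejected. A spill-only policy with the same too-small limit is allowed to start, because it cannot park producers and so cannot reach the deadlock this check guards against.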
Pick a victim, two ways
The policy is just a strategy object. Two policies facing the same pressure can pick different victims, and the choice between them is made at run time through a trait object.
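A minimal sketch of that swap follows. All names in it are hypothetical (Policy, BiggestBuffer, LowestPriority, policy_from_config, sample_consumers), and the consumer is a plain struct rather than a trait object to keep the example short; only the select_victim signature is modeled on the trait shown earlier.

```rust
type ConsumerId = u32;

/// Toy consumer record; the real trait passes `&dyn MemoryConsumer` instead.
struct Consumer {
    id: ConsumerId,
    bytes: u64,
    priority: u8, // lower value = less important (assumption)
}

/// Hypothetical strategy trait, shaped like the one in the lesson.
trait Policy {
    fn select_victim(&self, consumers: &[Consumer], pressure_bytes: u64) -> Option<ConsumerId>;
}

/// Picks whichever consumer holds the most bytes.
struct BiggestBuffer;
/// Picks whichever consumer matters least.
struct LowestPriority;

impl Policy for BiggestBuffer {
    fn select_victim(&self, consumers: &[Consumer], _pressure_bytes: u64) -> Option<ConsumerId> {
        consumers.iter().max_by_key(|c| c.bytes).map(|c| c.id)
    }
}

impl Policy for LowestPriority {
    fn select_victim(&self, consumers: &[Consumer], _pressure_bytes: u64) -> Option<ConsumerId> {
        consumers.iter().min_by_key(|c| c.priority).map(|c| c.id)
    }
}

/// The run-time seam: a config string decides which strategy gets boxed.
fn policy_from_config(name: &str) -> Box<dyn Policy> {
    match name {
        "priority" => Box::new(LowestPriority),
        _ => Box::new(BiggestBuffer),
    }
}

fn sample_consumers() -> Vec<Consumer> {
    vec![
        Consumer { id: 1, bytes: 800, priority: 5 },
        Consumer { id: 2, bytes: 300, priority: 1 },
        Consumer { id: 3, bytes: 500, priority: 3 },
    ]
}

fn main() {
    let consumers = sample_consumers();
    for name in ["largest", "priority"] {
        let policy = policy_from_config(name);
        println!("{name}: victim = {:?}", policy.select_victim(&consumers, 200));
    }
}
```

The loop body never changes; only the string handed to policy_from_config does. With the sample data, the size-based policy picks consumer 1 and the priority-based policy picks consumer 2.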
Swap the boxed policy and the victim changes with no other code touched: that’s the value of a
runtime-chosen strategy behind a trait. The real ArbitrationPolicy follows the same pattern, just with
LargestFirst / Priority / BackPressurePreferred and real operators.
// quick check
Operators are dispatched through a closed enum (no dyn), but the memory ArbitrationPolicy is a Box<dyn ArbitrationPolicy>. Why the different choice?
The deciding question is closed-and-owned vs open-and-configured. Node kinds are closed, so a closed enum gives compile-time exhaustiveness. The arbitration policy is picked at run time from config, so a trait object lets the strategy vary without the engine naming each one inline.
Inspect the decision
You’ve now seen the whole bounded-memory machine: buffers that spill, an arbitrator that decides,
a policy that picks. The next lesson drops to the lowest level the data layer reaches — the
unsafe code behind the string type every record field uses, and the invariants that keep it sound.