The thesis: agents now write analysis code faster than anyone can review it, so correctness has to move out of soft guidance and into a hard, mechanical gate. The division of labour: physicists review the physics semantics; agents generate the implementation; the Rust compiler and a validation layer reject any inconsistent state — a branch that doesn't exist for the era, a histogram filled before selection, a cross-section added to a luminosity (fb vs fb⁻¹), an incomplete set of systematics. The analysis is modelled as a typed state machine that makes invalid states unrepresentable.
This doesn't reinvent the agent harness — it confines one: the action space is gated by the compiler and the validator, so the orchestrator proposes and the compiler disposes (over a CLI and an MCP server). And the guarantee pays a dividend — a per-event kernel that compiles has no hidden shared state, which is exactly the condition for safe parallelism: if it compiles, it's safe to parallelize. On top sit Rust's systems strengths — performance, FFI to legacy libraries, SIMD per-event execution, and TUI-friendly orchestration.
API docs →
rustdoc for the workspace crates.
nano_spec · nano_analysis · nano_inference · nano_io · nano_core
Notes / blog →
Design notes, screencasts, and essays — including a spec→code walkthrough and the "compiler rejects a data race" demo.
Source →
Code, design docs, and roadmap on GitHub.
What works today
- Validated on real physics — reproduces ROOT's Higgs→ZZ→4ℓ analysis (df103, three channels) on CMS Open Data, read remotely on-demand in pure Rust, and the full stacked discovery plot is bit-identical to ROOT (the dimuon spectrum too). See the post →
- Semantic compiler — a physics spec (TOML/YAML) is statically validated against a NanoAOD catalogue (missing branch, wrong type, missing unit, undefined object are rejected) and generates the typed event loop; codegen is proven equal to the hand-written reference in CI.
- Compile-enforced state machine —
Raw → Baseline → Scored<M> → InRegion<R> → Weighted<R>; wrong-stage fills, score-before-inference, and unit mismatches are compile errors (proven by compile-fail tests). - Pure-Rust ROOT I/O — reads & writes
TTree(zlib/LZMA/lz4/zstd), reads real CMS NanoAODv9 locally and remotely on-demand over HTTPS byte-range; bounded-memory streaming; value-validated against uproot in CI. - Inference protocol — call external ML behind one
Predictortrait: mock, in-process ONNX, a remote endpoint, or a server the framework launches itself; declared in the spec as[[model]]and woven into the typed loop. - Agent action space — the same verified operations exposed over a
nanoCLI and an MCP server, so an orchestrating agent drives validate / derive / inspect / codegen through typed, compiler-gated tools. - Parallelism for free — the verified per-event kernel runs serial or chunk-parallel from one source; ~8× on 16 cores, bit-identical results.
The CMS Higgs→ZZ→4ℓ discovery plot, reconstructed by nano.rust from CMS Open Data and bit-identical to ROOT's df103 — signal + ZZ background stacked, 2012 data overlaid (11.6 fb⁻¹).