Commit 67e6a91

Author: Kossiso Royce

chore: release v0.4.0

- ONNX Linear/SVM/Normalizer/Scaler frontend support
- LinearStage, SVMStage, NormalizerStage IR nodes
- C99 emitter: Linear and SVM backends
- Embedded cross-compilation profiles (Cortex-M4/M33, RISC-V rv32imf/rv64gc)
- LLVM IR backend with configurable target triples
- Differential privacy module (Laplace + Gaussian)
- bench: P50/P95/P99/P999, CV%, --iters, --report JSON+HTML
- 3 ONNX parser bug fixes
- Nuclear-grade test suite: 436 tests passing
- Docs and GitHub Pages fully updated
1 parent 0eddfc2 commit 67e6a91

23 files changed

Lines changed: 5810 additions & 211 deletions

CHANGELOG.md

Lines changed: 61 additions & 1 deletion
@@ -11,6 +11,63 @@ Versioning: [Semantic Versioning](https://semver.org/)

---

## [0.4.0] — 2026-03-04

### Added

- **ONNX Linear/SVM/Normalizer/Scaler frontend** — `parse_onnx_model()` now handles six additional ONNX ML opset operators beyond `TreeEnsemble`:
  - `LinearClassifier` — binary (sigmoid) and multiclass (softmax), with correct per-row weight extraction
  - `LinearRegressor` — identity activation, arbitrary output dimensionality
  - `SVMClassifier` — RBF and linear kernels, full support-vector matrix extraction
  - `SVMRegressor` — same kernel support as the classifier
  - `Normalizer` — L1 / L2 / Max normalization as a `NormalizerStage` preprocessing step
  - `Scaler` — mean-shift / scale as a `ScalerStage` preprocessing step; fuses with downstream trees via pipeline fusion
- **`LinearStage` IR node** — new `PipelineStage` subclass holding weights, biases, activation (`none` / `sigmoid` / `softmax`), `n_classes`, and a `multi_weights` flag; full JSON serialization round-trip
- **`SVMStage` IR node** — new `PipelineStage` subclass holding the support-vector matrix, dual coefficients, rho, gamma, coef0, degree, and kernel type; full JSON serialization round-trip
- **`NormalizerStage` IR node** — new `PipelineStage` subclass; full JSON serialization round-trip
- **C99 emitter: Linear and SVM backends** — `C99Emitter.emit()` now dispatches `LinearStage` and `SVMStage` to dedicated emitters:
  - `_emit_inference_linear` — unrolled dot product, then sigmoid (binary), softmax (multiclass), or identity (regression)
  - `_emit_inference_svm` — RBF kernel (`exp(-γ·‖x−sv‖²)`) or linear kernel, with `tanh` post-transform
  - All outputs are bounded to `n_outputs` to prevent buffer overflows
- **Embedded deployment profiles** — `TargetSpec.for_embedded(profile)` selects cross-compilation toolchains for four targets; the Makefile emitted by `C99Emitter` switches automatically:
  - `cortex-m4` — `arm-none-eabi-gcc -mcpu=cortex-m4 -mfpu=fpv4-sp-d16 -mfloat-abi=hard`
  - `cortex-m33` — `arm-none-eabi-gcc -mcpu=cortex-m33 -mfpu=fpv5-sp-d16 -mfloat-abi=hard`
  - `rv32imf` — `riscv32-unknown-elf-gcc -march=rv32imf -mabi=ilp32f`
  - `rv64gc` — `riscv64-unknown-elf-gcc -march=rv64gc -mabi=lp64d`
  - No `-fPIC` or `-shared` flags on embedded targets; produces `.a` static libraries instead of `.so`
- **LLVM IR backend** (`timber/codegen/llvm_ir.py`) — new `LLVMIREmitter` supporting `TreeEnsembleStage`, `LinearStage`, and `SVMStage`; configurable target triple (`x86_64`, `aarch64`, `cortex-m4`, …); produces `model.ll` in SSA form with named `traverse_tree_N` per-tree functions and the `timber_infer_single` entry point
- **Differential privacy module** (`timber/privacy/dp.py`) — `apply_dp_noise(outputs, cfg)` injects calibrated noise into inference outputs; features:
  - Laplace mechanism: scale = `sensitivity / epsilon`
  - Gaussian mechanism: σ = `√(2 ln(1.25/δ)) · sensitivity / epsilon`
  - `DPConfig` — validates `epsilon > 0`, `sensitivity > 0`, `delta ∈ (0, 1)` for Gaussian, and the mechanism name
  - `DPReport` — returns `noise_scale`, `mechanism`, `n_outputs_noised`, `epsilon`, `delta`
  - `calibrate_epsilon(noise_level, sensitivity, mechanism)` — inverts the mechanism to find the required ε
  - Input dtype preserved (float32/float64 round-trips exactly); optional output clipping via `clip_outputs`, `output_min`, `output_max`
  - Deterministic replay with the `seed` parameter
- **`bench` command enhancements** — richer reporting beyond latency:
  - `--iters N` flag for total timed iterations (default: 1000)
  - P50 / P95 / P99 / P999 latency percentiles
  - Coefficient of variation (CV%) as a stability indicator
  - `--report PATH` writes a structured JSON report *and* a self-contained HTML file (no external dependencies) with a sortable results table and a system-info block
  - `_bench_report_html()` helper for programmatic HTML generation
- **Nuclear-grade test suite** (`tests/test_nuclear.py`) — 139 new tests (436 total passing) covering the IR layer, sklearn/ONNX parsers, numeric accuracy (C99 vs. Python IR), all optimizer passes plus idempotency and pipeline-fusion math verification, the diff compiler, the C99/WASM/MISRA-C/LLVM IR emitters, differential-privacy statistical correctness, and full end-to-end pipelines
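As a reference for the linear-stage math listed above (per-class dot product, then identity, sigmoid, or softmax), here is a minimal Python sketch of the same semantics. It is illustrative only: `linear_infer` is a hypothetical helper, not the generated C99 or Timber's internal code.

```python
import math

def linear_infer(x, weights, biases, activation="none"):
    """Per-class dot product + bias, then the configured activation.

    Mirrors the LinearStage semantics described above; illustrative only."""
    scores = [sum(w * v for w, v in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    if activation == "sigmoid":
        return [1.0 / (1.0 + math.exp(-s)) for s in scores]
    if activation == "softmax":
        m = max(scores)                       # max-shift for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]
    return scores                             # identity (regression)

probs = linear_infer([1.0, 2.0], [[0.5, -0.25], [-0.5, 0.25]], [0.0, 0.0], "softmax")
print(probs)  # → [0.5, 0.5]
```

The max-shift in the softmax branch is the usual guard against `exp` overflow for large scores.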

### Fixed

- **ONNX `classlabels_ints` attribute name** — the parser was reading `classlabels_int64s` (wrong); multiclass models always reported `n_classes = 2`, producing incorrect weight slicing and garbage softmax outputs
- **Binary ONNX `LinearClassifier` double weight row** — `skl2onnx` emits both class rows for binary models; the parser now extracts only the positive-class row and sets `multi_weights = False`, fixing incorrect weight counts and index misalignment
- **C99 buffer overflow guard** — the `multi_weights = True` softmax loop is now bounded by `n_outputs` (not `n_classes`), preventing out-of-bounds writes when the output buffer is smaller than the number of internal score slots

### Changed

- ONNX supported-operator list expanded from `TreeEnsemble{Classifier,Regressor}` to include `LinearClassifier`, `LinearRegressor`, `SVMClassifier`, `SVMRegressor`, `Normalizer`, `ZipMap`, `Scaler`
- `C99Emitter.emit()` dispatch table extended; an unknown primary stage now raises `ValueError("No supported primary stage")`
- `pyproject.toml` development status upgraded from `3 - Alpha` to `4 - Beta`
- `[project.optional-dependencies]` gains `privacy = ["numpy>=1.24"]`; `full` gains `onnx>=1.14` and `skl2onnx>=1.15`
- Test count: 297 → 436
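The latency statistics added to `bench` in this release (P50/P95/P99/P999 and CV%) can be illustrated with a small sketch using nearest-rank percentiles. This is not Timber's implementation; `latency_stats` is a hypothetical helper.

```python
import statistics

def latency_stats(samples_us):
    """Percentile + stability summary mirroring the bench report fields.

    Nearest-rank percentiles; illustrative, not Timber's implementation."""
    s = sorted(samples_us)

    def pct(p):
        # index of the nearest rank, clamped to the sample range
        k = min(len(s) - 1, max(0, round(p / 100 * (len(s) - 1))))
        return s[k]

    mean = statistics.fmean(s)
    cv = 100.0 * statistics.stdev(s) / mean  # coefficient of variation, in %
    return {"p50": pct(50), "p95": pct(95), "p99": pct(99),
            "p999": pct(99.9), "cv_pct": cv}

stats = latency_stats([2.1, 2.0, 2.2, 2.1, 9.5] + [2.0] * 95)
print(stats["p50"], stats["p999"])  # → 2.0 9.5
```

A single 9.5 µs outlier in 100 samples leaves P50 untouched but dominates P999 and inflates CV%, which is exactly why these fields are useful as a stability indicator.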

---

## [0.3.0] — 2026-03-04

### Added
@@ -99,5 +156,8 @@ Versioning: [Semantic Versioning](https://semver.org/)

- `bench --warmup-iters` value was silently capped at 100; now uses the full user-supplied value
- `timber list` model names printed before table (cosmetic ordering); names now appear after

[Unreleased]: https://github.com/kossisoroyce/timber/compare/v0.4.0...HEAD
[0.4.0]: https://github.com/kossisoroyce/timber/compare/v0.3.0...v0.4.0
[0.3.0]: https://github.com/kossisoroyce/timber/compare/v0.2.0...v0.3.0
[0.2.0]: https://github.com/kossisoroyce/timber/compare/v0.1.0...v0.2.0
[0.1.0]: https://github.com/kossisoroyce/timber/releases/tag/v0.1.0

README.md

Lines changed: 23 additions & 17 deletions
@@ -23,7 +23,7 @@

---

Timber takes a trained ML model — XGBoost, LightGBM, scikit-learn, CatBoost, or ONNX (tree ensembles, linear models, SVMs) — runs it through a multi-pass optimizing compiler, and emits a **self-contained C99 inference artifact** with zero runtime dependencies. A built-in HTTP server (Ollama-compatible API) lets you serve any model — local file or remote URL — in one command.

> **~2 µs single-sample inference · ~336× faster than Python XGBoost · ~48 KB artifact · zero runtime dependencies**
@@ -178,10 +178,10 @@ curl -s http://localhost:11434/api/predict \

| Framework | File format | Notes |
|-----------|-------------|-------|
| XGBoost | `.json` | All objectives; multiclass, binary, regression; XGBoost 3.1+ per-class base_score |
| LightGBM | `.txt`, `.model`, `.lgb` | All objectives including multiclass |
| scikit-learn | `.pkl`, `.pickle` | GradientBoostingClassifier/Regressor, RandomForest, ExtraTrees, DecisionTree, Pipeline |
| ONNX | `.onnx` | `TreeEnsembleClassifier/Regressor`, `LinearClassifier/Regressor`, `SVMClassifier/Regressor`, `Normalizer`, `Scaler` |
| CatBoost | `.json` | JSON export (`save_model(..., format='json')`) |
---
@@ -218,10 +218,12 @@ See [`benchmarks/`](benchmarks/) for full methodology, hardware capture script,

| **Latency** | ~2 µs | 100s of µs–ms | ~100 µs | ~10–30 µs | ~50 µs |
| **Runtime deps** | None | Python + framework | ONNX Runtime libs | Treelite runtime | Python + LightGBM |
| **Artifact size** | ~48 KB | 50–200+ MB process | MBs | MB-scale | Python env |
| **Formats** | 5 (trees + linear + SVM) | Each framework only | ONNX only | GBDTs | LightGBM only |
| **C export** | Yes (C99) | No | No | Yes | No |
| **LLVM IR export** | Yes | No | No | No | No |
| **Edge / embedded** | Yes (Cortex-M4/M33, RISC-V) | No | Partial | Partial | No |
| **MISRA-C output** | Yes | No | No | No | No |
| **Differential privacy** | Yes | No | No | No | No |

---

@@ -284,12 +286,13 @@ Each script trains a model, saves it, runs `timber load`, and validates predicti

## Limitations

- **ONNX** — supports `TreeEnsemble`, `LinearClassifier/Regressor`, `SVMClassifier/Regressor`, `Normalizer`, `Scaler`; other operators (e.g., neural-network layers) are not yet supported
- **CatBoost** — requires JSON export (`save_model(..., format='json')`); the native binary format is not supported
- **scikit-learn** — major estimators and `Pipeline` wrappers are supported; uncommon custom estimators may require a custom front-end
- **Pickle** — follow standard pickle security hygiene; only load artifacts from trusted sources
- **XGBoost** — the JSON model format is the primary path; the binary booster format is not supported
- **LLVM IR** — currently emitted as text (`.ll`); a local LLVM/Clang installation is required to produce native code from it
- **MISRA-C** — the built-in compliance checker covers the rules most relevant to the generated code; it is not a substitute for a certified static-analysis tool

---

@@ -300,15 +303,18 @@ Each script trains a model, saves it, runs `timber load`, and validates predicti

| ✅ | XGBoost, LightGBM, scikit-learn, CatBoost, ONNX front-ends |
| ✅ | Multi-pass IR optimizer (dead-leaf, quantization, branch sort, scaler fusion) |
| ✅ | C99 emitter with WebAssembly target |
| ✅ | Ollama-compatible HTTP inference server with multi-worker FastAPI |
| ✅ | PyPI packaging with OIDC trusted publishing |
| ✅ | ONNX Linear/SVM/Normalizer/Scaler operator support |
| ✅ | ARM Cortex-M4/M33 and RISC-V rv32imf/rv64gc embedded deployment profiles |
| ✅ | MISRA-C:2012 compliant output mode with built-in compliance checker |
| ✅ | LLVM IR backend with configurable target triples |
| ✅ | Differential privacy (Laplace + Gaussian) inference mode |
| ✅ | Richer `bench` reports: P50/P95/P99/P999, CV%, JSON + HTML output |
| 🔄 | Remote model registry (`timber pull` from hosted model library) |
| 🔲 | Neural network operator support (MLPClassifier) |
| 🔲 | ONNX export path (Timber IR → ONNX) |
| 🔲 | Rust backend emitter |

---

@@ -318,11 +324,11 @@ Each script trains a model, saves it, runs `timber load`, and validates predicti

```bash
git clone https://github.com/kossisoroyce/timber.git
cd timber
pip install -e ".[dev]"
pytest tests/ -v     # 436 tests
ruff check timber/   # linting
```

The test suite covers: parsers (sklearn, ONNX, XGBoost, LightGBM, CatBoost), the IR layer (serialization, deep_copy, all stage types), optimizer passes (correctness, idempotency, pipeline-fusion math), the C99/WASM/MISRA-C/LLVM IR emitters (compile + numeric accuracy), differential privacy (statistical correctness, all dtypes), and full end-to-end pipelines.

See [`CONTRIBUTING.md`](CONTRIBUTING.md) for the full development guide.

docs/advanced.md

Lines changed: 157 additions & 14 deletions
@@ -100,27 +100,172 @@ Usage in the browser:

## LLVM IR Backend

Emit LLVM IR (`.ll`) for hardware-specific optimization or integration with existing LLVM toolchains:

```python
from timber.codegen.llvm_ir import LLVMIREmitter
from timber.frontends.auto_detect import parse_model

ir = parse_model("model.json")

# Emit for the host architecture
emitter = LLVMIREmitter(target="x86_64")
out = emitter.emit(ir)
print(out.model_ll[:500])  # SSA-form LLVM IR text

# Save to disk
files = out.save("./dist/")
# files["model.ll"] — LLVM IR text file
```

Supported target triples:

| Alias | Triple emitted |
|-------|----------------|
| `x86_64` | `x86_64-unknown-linux-gnu` |
| `aarch64` | `aarch64-unknown-linux-gnu` |
| `cortex-m4` | `thumbv7em-none-eabi` |
| `cortex-m33` | `thumbv8m.main-none-eabi` |
| `rv32imf` | `riscv32-unknown-elf` |
| `rv64gc` | `riscv64-unknown-elf` |

Compile to native code with LLVM:

```bash
llc -filetype=obj model.ll -o model.o
clang model.o -shared -o model.so -lm
```
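Before invoking `llc`, a quick sanity check is to list the functions in the emitted IR (the `traverse_tree_N` helpers and the `timber_infer_single` entry point). The fragment below runs on a mock IR string, not real emitter output, and `listed_functions` is a hypothetical helper:

```python
import re

# Mock fragment standing in for emitter output; not valid, runnable IR.
sample_ll = """
define float @traverse_tree_0(float* %x) { ... }
define float @traverse_tree_1(float* %x) { ... }
define float @timber_infer_single(float* %x) { ... }
"""

def listed_functions(ll_text):
    # Pull the symbol name out of every `define <type> @name(` line.
    return re.findall(r"define\s+\S+\s+@(\w+)\(", ll_text)

print(listed_functions(sample_ll))
# → ['traverse_tree_0', 'traverse_tree_1', 'timber_infer_single']
```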
---

## Embedded Cross-Compilation

Target ARM Cortex-M and RISC-V microcontrollers with the built-in embedded profiles:

```python
from timber.codegen.c99 import C99Emitter, TargetSpec
from timber.frontends.auto_detect import parse_model

ir = parse_model("model.json")

# Select an embedded profile
spec = TargetSpec.for_embedded("cortex-m4")
out = C99Emitter(spec).emit(ir)
out.write("./dist/")
```

The emitted `Makefile` automatically uses the correct cross-compiler:

| Profile | Toolchain | Flags |
|---------|-----------|-------|
| `cortex-m4` | `arm-none-eabi-gcc` | `-mcpu=cortex-m4 -mfpu=fpv4-sp-d16 -mfloat-abi=hard` |
| `cortex-m33` | `arm-none-eabi-gcc` | `-mcpu=cortex-m33 -mfpu=fpv5-sp-d16 -mfloat-abi=hard` |
| `rv32imf` | `riscv32-unknown-elf-gcc` | `-march=rv32imf -mabi=ilp32f` |
| `rv64gc` | `riscv64-unknown-elf-gcc` | `-march=rv64gc -mabi=lp64d` |

Embedded builds produce a static `.a` library (no `-fPIC`, no `-shared`) suitable for bare-metal linking.

```bash
cd dist/
make    # invokes arm-none-eabi-gcc, produces libtimber_model.a
```

---

## MISRA-C Compliance Mode

For safety-critical deployments (automotive, medical, avionics):

```python
from timber.codegen.misra_c import MisraCEmitter
from timber.frontends.auto_detect import parse_model

ir = parse_model("model.json")

# Generate MISRA-C:2012 compliant code
emitter = MisraCEmitter()
out = emitter.emit(ir)

# Check compliance (returns a ComplianceReport)
report = emitter.check_compliance(out.model_c)
print(f"Compliant: {report.is_compliant}")
print(f"Violations: {len(report.violations)}")
print(f"Rules checked: {report.rules_checked}")
for v in report.violation_objects:
    print(f"  [{v.severity}] Rule {v.rule}: {v.description}")

# Write to disk and compile normally
out.write("./dist/")
```

Rules checked by the built-in verifier:

| Rule | Description |
|------|-------------|
| 1.1 | No compiler extensions (`__attribute__`, `__declspec`) |
| 7.1 | No octal integer literals |
| 14.4 | No VLAs |
| 20.4 | No `#undef` |
| 20.9 | No `<stdio.h>` include |
| 21.1 | No reserved-identifier redefinition |
| 21.6 | No `printf`/`scanf` |
| 22.x | All variables initialized at declaration |
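To illustrate the kind of textual checks the table describes, here is a toy scan for two of the rules (7.1 octal literals, 20.4 `#undef`). It is a simplified stand-in, not the logic of `check_compliance`, and `quick_scan` is a hypothetical helper:

```python
import re

# Rule 7.1: a literal 0 followed by octal digits (naive; e.g. ignores 0x hex
# because 'x' is outside the digit class, but would misfire inside strings).
OCTAL = re.compile(r"\b0[0-7]+\b")
# Rule 20.4: a preprocessor #undef directive at the start of a line.
UNDEF = re.compile(r"^\s*#\s*undef\b", re.MULTILINE)

def quick_scan(c_source):
    """Return the list of (toy) rule IDs violated by the given C source."""
    violations = []
    if OCTAL.search(c_source):
        violations.append("7.1")
    if UNDEF.search(c_source):
        violations.append("20.4")
    return violations

print(quick_scan("int x = 010;\n#undef FOO\n"))  # → ['7.1', '20.4']
```

Real MISRA checking works on the token stream and preprocessor state rather than raw text, which is why the built-in verifier (and certified tools) go well beyond pattern matching like this.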

## Differential Privacy

Add calibrated noise to model outputs for privacy-preserving inference:

```python
import numpy as np

from timber.privacy.dp import DPConfig, apply_dp_noise, calibrate_epsilon

# Configure the mechanism
cfg = DPConfig(
    mechanism="laplace",   # or "gaussian"
    epsilon=1.0,           # privacy budget
    sensitivity=1.0,       # L1 sensitivity of your model output
    clip_outputs=True,
    output_min=0.0,
    output_max=1.0,
)

# Apply noise to raw inference outputs
raw_outputs = np.array([[0.85, 0.15], [0.32, 0.68]], dtype=np.float32)
noisy_outputs, report = apply_dp_noise(raw_outputs, cfg)

print(f"Mechanism: {report.mechanism}")
print(f"Noise scale: {report.noise_scale:.4f}")
print(f"Outputs: {report.n_outputs_noised}")
print(report.summary())
```

**Mechanisms:**

| Mechanism | Noise scale | Best for |
|-----------|-------------|----------|
| `laplace` | `sensitivity / epsilon` | Unbounded outputs, `delta = 0` |
| `gaussian` | `√(2 ln(1.25/δ)) · sensitivity / epsilon` | Bounded outputs, (`ε`, `δ`)-DP |
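The two noise scales in the table follow the standard Laplace and Gaussian mechanism calibrations, so they are easy to check numerically. The helpers below are illustrative, not the `timber.privacy.dp` internals:

```python
import math

def laplace_scale(sensitivity, epsilon):
    # Laplace mechanism: b = sensitivity / epsilon
    return sensitivity / epsilon

def gaussian_sigma(sensitivity, epsilon, delta):
    # Gaussian mechanism: sigma = sqrt(2 ln(1.25/delta)) * sensitivity / epsilon
    return math.sqrt(2 * math.log(1.25 / delta)) * sensitivity / epsilon

print(laplace_scale(1.0, 1.0))                    # → 1.0
print(round(gaussian_sigma(1.0, 1.0, 1e-5), 2))   # → 4.84
```

Note how the Gaussian scale grows only logarithmically as `delta` shrinks, while both scales grow linearly as the budget `epsilon` tightens.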

**Calibrating epsilon** — find the privacy budget needed to limit noise to a target level:

```python
epsilon = calibrate_epsilon(
    noise_level=0.05,   # tolerable noise standard deviation
    sensitivity=1.0,
    mechanism="laplace",
)
print(f"Required epsilon: {epsilon:.3f}")
```

**Notes:**

- Input dtype is preserved exactly (float32 in → float32 out; float64 in → float64 out)
- Pass `seed=42` for deterministic, reproducible noise (useful for testing)
- Apply *after* C99 / Python inference, *before* returning results to clients

---

## Differential Compilation

When retraining models incrementally, avoid full recompilation:

@@ -134,13 +279,11 @@

```python
# old_ir: IR of the previously compiled model (parsed earlier in this example)
new_ir = parse_model("model_v2.json")

# Compute diff
diff = diff_models(old_ir, new_ir)
print(diff.summary())
# {"added": 3, "removed": 1, "modified": 2, "unchanged": 44}

# Incremental compile (reuses unchanged trees, annotates IR with diff metadata)
updated_ir = incremental_compile(old_ir, new_ir, diff)
```
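As an illustration of what such a model diff computes, a structural-hash comparison over tree lists looks roughly like the sketch below. This is the idea only, not `diff_models` itself; `diff_trees` is a hypothetical helper, and position-based `modified` detection is omitted for brevity:

```python
import hashlib
import json

def tree_hash(tree):
    # Canonical JSON so structurally identical trees hash identically.
    return hashlib.sha1(json.dumps(tree, sort_keys=True).encode()).hexdigest()

def diff_trees(old, new):
    """Classify trees (as dicts) into added / removed / unchanged by hash."""
    old_h = {tree_hash(t) for t in old}
    new_h = {tree_hash(t) for t in new}
    return {
        "added": len(new_h - old_h),
        "removed": len(old_h - new_h),
        "unchanged": len(old_h & new_h),
    }

old = [{"leaf": 1.0}, {"leaf": 2.0}]
new = [{"leaf": 2.0}, {"leaf": 3.0}, {"leaf": 4.0}]
print(diff_trees(old, new))  # → {'added': 2, 'removed': 1, 'unchanged': 1}
```

Only the trees in the `added` set need code generation, which is what makes incremental compilation cheap when most of the ensemble is untouched.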
## Ensemble Composition
