Commit 67e6a91

Author: Kossiso Royce

chore: release v0.4.0

- ONNX Linear/SVM/Normalizer/Scaler frontend support
- LinearStage, SVMStage, NormalizerStage IR nodes
- C99 emitter: Linear and SVM backends
- Embedded cross-compilation profiles (Cortex-M4/M33, RISC-V rv32imf/rv64gc)
- LLVM IR backend with configurable target triples
- Differential privacy module (Laplace + Gaussian)
- bench: P50/P95/P99/P999, CV%, --iters, --report JSON+HTML
- 3 ONNX parser bug fixes
- Nuclear-grade test suite: 436 tests passing
- Docs and GitHub Pages fully updated
1 parent 0eddfc2 commit 67e6a91

23 files changed

Lines changed: 5810 additions & 211 deletions

CHANGELOG.md

Lines changed: 61 additions & 1 deletion
@@ -11,6 +11,63 @@ Versioning: [Semantic Versioning](https://semver.org/)

---

## [0.4.0] — 2026-03-04

### Added

- **ONNX Linear/SVM/Normalizer/Scaler frontend** — `parse_onnx_model()` now handles six additional ONNX ML opset operators beyond `TreeEnsemble`:
  - `LinearClassifier` — binary (sigmoid) and multiclass (softmax), with correct per-row weight extraction
  - `LinearRegressor` — identity activation, arbitrary output dimensionality
  - `SVMClassifier` — RBF and linear kernels, full support-vector matrix extraction
  - `SVMRegressor` — same kernel support as the classifier
  - `Normalizer` — L1 / L2 / Max normalization as a `NormalizerStage` preprocessing step
  - `Scaler` — mean-shift / scale as a `ScalerStage` preprocessing step; fuses with downstream trees via pipeline fusion
- **`LinearStage` IR node** — new `PipelineStage` subclass holding weights, biases, activation (`none` / `sigmoid` / `softmax`), `n_classes`, and a `multi_weights` flag; full JSON serialization round-trip
- **`SVMStage` IR node** — new `PipelineStage` subclass holding the support-vector matrix, dual coefficients, rho, gamma, coef0, degree, and kernel type; full JSON serialization round-trip
- **`NormalizerStage` IR node** — new `PipelineStage` subclass; full JSON serialization round-trip
- **C99 emitter: Linear and SVM backends** — `C99Emitter.emit()` now dispatches `LinearStage` and `SVMStage` to dedicated emitters:
  - `_emit_inference_linear` — unrolled dot product, then sigmoid (binary), softmax (multiclass), or identity (regression)
  - `_emit_inference_svm` — RBF kernel (`exp(-γ·‖x−sv‖²)`) or linear kernel, with `tanh` post-transform
  - All outputs are bounded to `n_outputs` to prevent buffer overflows
- **Embedded deployment profiles** — `TargetSpec.for_embedded(profile)` selects cross-compilation toolchains for four targets; the Makefile emitted by `C99Emitter` switches automatically:
  - `cortex-m4` — `arm-none-eabi-gcc -mcpu=cortex-m4 -mfpu=fpv4-sp-d16 -mfloat-abi=hard`
  - `cortex-m33` — `arm-none-eabi-gcc -mcpu=cortex-m33 -mfpu=fpv5-sp-d16 -mfloat-abi=hard`
  - `rv32imf` — `riscv32-unknown-elf-gcc -march=rv32imf -mabi=ilp32f`
  - `rv64gc` — `riscv64-unknown-elf-gcc -march=rv64gc -mabi=lp64d`
  - No `-fPIC` or `-shared` flags on embedded targets; produces `.a` static libraries instead of `.so`
- **LLVM IR backend** (`timber/codegen/llvm_ir.py`) — new `LLVMIREmitter` supporting `TreeEnsembleStage`, `LinearStage`, and `SVMStage`; configurable target triple (`x86_64`, `aarch64`, `cortex-m4`, …); produces `model.ll` in SSA form with named `traverse_tree_N` per-tree functions and the `timber_infer_single` entry point
- **Differential privacy module** (`timber/privacy/dp.py`) — `apply_dp_noise(outputs, cfg)` injects calibrated noise into inference outputs; features:
  - Laplace mechanism: scale = `sensitivity / epsilon`
  - Gaussian mechanism: σ = `√(2 ln(1.25/δ)) · sensitivity / epsilon`
  - `DPConfig` — validates `epsilon > 0`, `sensitivity > 0`, `delta ∈ (0, 1)` for Gaussian, and the mechanism name
  - `DPReport` — returns `noise_scale`, `mechanism`, `n_outputs_noised`, `epsilon`, `delta`
  - `calibrate_epsilon(noise_level, sensitivity, mechanism)` — inverts the mechanism to find the required ε
  - Input dtype preserved (float32/float64 round-trips exactly); optional output clipping via `clip_outputs`, `output_min`, `output_max`
  - Deterministic replay with the `seed` parameter
- **`bench` command enhancements** — richer reporting beyond latency:
  - `--iters N` flag for total timed iterations (default: 1000)
  - P50 / P95 / P99 / P999 latency percentiles
  - Coefficient of variation (CV%) as a stability indicator
  - `--report PATH` writes a structured JSON report *and* a self-contained HTML file (no external dependencies) with a sortable results table and a system-info block
  - `_bench_report_html()` helper for programmatic HTML generation
- **Nuclear-grade test suite** (`tests/test_nuclear.py`) — 139 new tests (436 total passing) covering the IR layer, sklearn/ONNX parsers, numeric accuracy (C99 vs. Python IR), all optimizer passes plus idempotency and pipeline-fusion math verification, the diff compiler, the C99/WASM/MISRA-C/LLVM IR emitters, differential-privacy statistical correctness, and full end-to-end pipelines
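As a reference for the linear-stage math listed above (per-class dot product, then identity, sigmoid, or softmax), here is a minimal Python sketch of the same semantics. It is illustrative only: `linear_infer` is a hypothetical helper, not the generated C99 or Timber's internal code.

```python
import math

def linear_infer(x, weights, biases, activation="none"):
    """Per-class dot product + bias, then the configured activation.

    Mirrors the LinearStage semantics described above; illustrative only."""
    scores = [sum(w * v for w, v in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    if activation == "sigmoid":
        return [1.0 / (1.0 + math.exp(-s)) for s in scores]
    if activation == "softmax":
        m = max(scores)                       # max-shift for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]
    return scores                             # identity (regression)

probs = linear_infer([1.0, 2.0], [[0.5, -0.25], [-0.5, 0.25]], [0.0, 0.0], "softmax")
print(probs)  # → [0.5, 0.5]
```

The max-shift in the softmax branch is the usual guard against `exp` overflow for large scores.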

### Fixed

- **ONNX `classlabels_ints` attribute name** — the parser was reading `classlabels_int64s` (wrong); multiclass models always reported `n_classes = 2`, producing incorrect weight slicing and garbage softmax outputs
- **Binary ONNX `LinearClassifier` double weight row** — `skl2onnx` emits both class rows for binary models; the parser now extracts only the positive-class row and sets `multi_weights = False`, fixing incorrect weight counts and index misalignment
- **C99 buffer overflow guard** — the `multi_weights = True` softmax loop is now bounded by `n_outputs` (not `n_classes`), preventing out-of-bounds writes when the output buffer is smaller than the number of internal score slots

### Changed

- ONNX supported-operator list expanded from `TreeEnsemble{Classifier,Regressor}` to include `LinearClassifier`, `LinearRegressor`, `SVMClassifier`, `SVMRegressor`, `Normalizer`, `ZipMap`, `Scaler`
- `C99Emitter.emit()` dispatch table extended; an unknown primary stage now raises `ValueError("No supported primary stage")`
- `pyproject.toml` development status upgraded from `3 - Alpha` to `4 - Beta`
- `[project.optional-dependencies]` gains `privacy = ["numpy>=1.24"]`; `full` gains `onnx>=1.14` and `skl2onnx>=1.15`
- Test count: 297 → 436
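The latency statistics added to `bench` in this release (P50/P95/P99/P999 and CV%) can be illustrated with a small sketch using nearest-rank percentiles. This is not Timber's implementation; `latency_stats` is a hypothetical helper.

```python
import statistics

def latency_stats(samples_us):
    """Percentile + stability summary mirroring the bench report fields.

    Nearest-rank percentiles; illustrative, not Timber's implementation."""
    s = sorted(samples_us)

    def pct(p):
        # index of the nearest rank, clamped to the sample range
        k = min(len(s) - 1, max(0, round(p / 100 * (len(s) - 1))))
        return s[k]

    mean = statistics.fmean(s)
    cv = 100.0 * statistics.stdev(s) / mean  # coefficient of variation, in %
    return {"p50": pct(50), "p95": pct(95), "p99": pct(99),
            "p999": pct(99.9), "cv_pct": cv}

stats = latency_stats([2.1, 2.0, 2.2, 2.1, 9.5] + [2.0] * 95)
print(stats["p50"], stats["p999"])  # → 2.0 9.5
```

A single 9.5 µs outlier in 100 samples leaves P50 untouched but dominates P999 and inflates CV%, which is exactly why these fields are useful as a stability indicator.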

---

## [0.3.0] — 2026-03-04

### Added
@@ -99,5 +156,8 @@ Versioning: [Semantic Versioning](https://semver.org/)

- `bench --warmup-iters` value was silently capped at 100; now uses the full user-supplied value
- `timber list` model names printed before table (cosmetic ordering); names now appear after

[Unreleased]: https://github.com/kossisoroyce/timber/compare/v0.4.0...HEAD
[0.4.0]: https://github.com/kossisoroyce/timber/compare/v0.3.0...v0.4.0
[0.3.0]: https://github.com/kossisoroyce/timber/compare/v0.2.0...v0.3.0
[0.2.0]: https://github.com/kossisoroyce/timber/compare/v0.1.0...v0.2.0
[0.1.0]: https://github.com/kossisoroyce/timber/releases/tag/v0.1.0

README.md

Lines changed: 23 additions & 17 deletions
@@ -23,7 +23,7 @@

---

Timber takes a trained ML model — XGBoost, LightGBM, scikit-learn, CatBoost, or ONNX (tree ensembles, linear models, SVMs) — runs it through a multi-pass optimizing compiler, and emits a **self-contained C99 inference artifact** with zero runtime dependencies. A built-in HTTP server (Ollama-compatible API) lets you serve any model — local file or remote URL — in one command.

> **~2 µs single-sample inference · ~336× faster than Python XGBoost · ~48 KB artifact · zero runtime dependencies**
@@ -178,10 +178,10 @@ curl -s http://localhost:11434/api/predict \

| Framework | File format | Notes |
|-----------|-------------|-------|
| XGBoost | `.json` | All objectives; multiclass, binary, regression; XGBoost 3.1+ per-class base_score |
| LightGBM | `.txt`, `.model`, `.lgb` | All objectives including multiclass |
| scikit-learn | `.pkl`, `.pickle` | GradientBoostingClassifier/Regressor, RandomForest, ExtraTrees, DecisionTree, Pipeline |
| ONNX | `.onnx` | `TreeEnsembleClassifier/Regressor`, `LinearClassifier/Regressor`, `SVMClassifier/Regressor`, `Normalizer`, `Scaler` |
| CatBoost | `.json` | JSON export (`save_model(..., format='json')`) |
---
@@ -218,10 +218,12 @@ See [`benchmarks/`](benchmarks/) for full methodology, hardware capture script,

| **Latency** | ~2 µs | 100s of µs–ms | ~100 µs | ~10–30 µs | ~50 µs |
| **Runtime deps** | None | Python + framework | ONNX Runtime libs | Treelite runtime | Python + LightGBM |
| **Artifact size** | ~48 KB | 50–200+ MB process | MBs | MB-scale | Python env |
| **Formats** | 5 (trees + linear + SVM) | Each framework only | ONNX only | GBDTs | LightGBM only |
| **C export** | Yes (C99) | No | No | Yes | No |
| **LLVM IR export** | Yes | No | No | No | No |
| **Edge / embedded** | Yes (Cortex-M4/M33, RISC-V) | No | Partial | Partial | No |
| **MISRA-C output** | Yes | No | No | No | No |
| **Differential privacy** | Yes | No | No | No | No |

---

@@ -284,12 +286,13 @@ Each script trains a model, saves it, runs `timber load`, and validates predicti

## Limitations

- **ONNX** — supports `TreeEnsemble`, `LinearClassifier/Regressor`, `SVMClassifier/Regressor`, `Normalizer`, `Scaler`; other operators (e.g., neural-network layers) are not yet supported
- **CatBoost** — requires JSON export (`save_model(..., format='json')`); the native binary format is not supported
- **scikit-learn** — major estimators and `Pipeline` wrappers are supported; uncommon custom estimators may require a custom front-end
- **Pickle** — follow standard pickle security hygiene; only load artifacts from trusted sources
- **XGBoost** — the JSON model format is the primary path; the binary booster format is not supported
- **LLVM IR** — currently emitted as text (`.ll`); a local LLVM/Clang installation is required to produce native code from it
- **MISRA-C** — the built-in compliance checker covers the rules most relevant to the generated code; it is not a substitute for a certified static-analysis tool

---

@@ -300,15 +303,18 @@ Each script trains a model, saves it, runs `timber load`, and validates predicti

| ✅ | XGBoost, LightGBM, scikit-learn, CatBoost, ONNX front-ends |
| ✅ | Multi-pass IR optimizer (dead-leaf, quantization, branch sort, scaler fusion) |
| ✅ | C99 emitter with WebAssembly target |
| ✅ | Ollama-compatible HTTP inference server with multi-worker FastAPI |
| ✅ | PyPI packaging with OIDC trusted publishing |
| ✅ | ONNX Linear/SVM/Normalizer/Scaler operator support |
| ✅ | ARM Cortex-M4/M33 and RISC-V rv32imf/rv64gc embedded deployment profiles |
| ✅ | MISRA-C:2012 compliant output mode with built-in compliance checker |
| ✅ | LLVM IR backend with configurable target triples |
| ✅ | Differential privacy (Laplace + Gaussian) inference mode |
| ✅ | Richer `bench` reports: P50/P95/P99/P999, CV%, JSON + HTML output |
| 🔄 | Remote model registry (`timber pull` from hosted model library) |
| 🔲 | Neural network operator support (MLPClassifier) |
| 🔲 | ONNX export path (Timber IR → ONNX) |
| 🔲 | Rust backend emitter |

---

@@ -318,11 +324,11 @@ Each script trains a model, saves it, runs `timber load`, and validates predicti

```bash
git clone https://github.com/kossisoroyce/timber.git
cd timber
pip install -e ".[dev]"
pytest tests/ -v     # 436 tests
ruff check timber/   # linting
```

The test suite covers: parsers (sklearn, ONNX, XGBoost, LightGBM, CatBoost), the IR layer (serialization, deep_copy, all stage types), optimizer passes (correctness, idempotency, pipeline-fusion math), the C99/WASM/MISRA-C/LLVM IR emitters (compile + numeric accuracy), differential privacy (statistical correctness, all dtypes), and full end-to-end pipelines.

See [`CONTRIBUTING.md`](CONTRIBUTING.md) for the full development guide.

docs/advanced.md

Lines changed: 157 additions & 14 deletions
@@ -100,27 +100,172 @@ Usage in the browser:

## LLVM IR Backend

Emit LLVM IR (`.ll`) for hardware-specific optimization or integration with existing LLVM toolchains:

```python
from timber.codegen.llvm_ir import LLVMIREmitter
from timber.frontends.auto_detect import parse_model

ir = parse_model("model.json")

# Emit for the host architecture
emitter = LLVMIREmitter(target="x86_64")
out = emitter.emit(ir)
print(out.model_ll[:500])  # SSA-form LLVM IR text

# Save to disk
files = out.save("./dist/")
# files["model.ll"] — LLVM IR text file
```

Supported target triples:

| Alias | Triple emitted |
|-------|----------------|
| `x86_64` | `x86_64-unknown-linux-gnu` |
| `aarch64` | `aarch64-unknown-linux-gnu` |
| `cortex-m4` | `thumbv7em-none-eabi` |
| `cortex-m33` | `thumbv8m.main-none-eabi` |
| `rv32imf` | `riscv32-unknown-elf` |
| `rv64gc` | `riscv64-unknown-elf` |

Compile to native code with LLVM:

```bash
llc -filetype=obj model.ll -o model.o
clang model.o -shared -o model.so -lm
```
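Before invoking `llc`, a quick sanity check is to list the functions in the emitted IR (the `traverse_tree_N` helpers and the `timber_infer_single` entry point). The fragment below runs on a mock IR string, not real emitter output, and `listed_functions` is a hypothetical helper:

```python
import re

# Mock fragment standing in for emitter output; not valid, runnable IR.
sample_ll = """
define float @traverse_tree_0(float* %x) { ... }
define float @traverse_tree_1(float* %x) { ... }
define float @timber_infer_single(float* %x) { ... }
"""

def listed_functions(ll_text):
    # Pull the symbol name out of every `define <type> @name(` line.
    return re.findall(r"define\s+\S+\s+@(\w+)\(", ll_text)

print(listed_functions(sample_ll))
# → ['traverse_tree_0', 'traverse_tree_1', 'timber_infer_single']
```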
---

## Embedded Cross-Compilation

Target ARM Cortex-M and RISC-V microcontrollers with the built-in embedded profiles:

```python
from timber.codegen.c99 import C99Emitter, TargetSpec
from timber.frontends.auto_detect import parse_model

ir = parse_model("model.json")

# Select an embedded profile
spec = TargetSpec.for_embedded("cortex-m4")
out = C99Emitter(spec).emit(ir)
out.write("./dist/")
```

The emitted `Makefile` automatically uses the correct cross-compiler:

| Profile | Toolchain | Flags |
|---------|-----------|-------|
| `cortex-m4` | `arm-none-eabi-gcc` | `-mcpu=cortex-m4 -mfpu=fpv4-sp-d16 -mfloat-abi=hard` |
| `cortex-m33` | `arm-none-eabi-gcc` | `-mcpu=cortex-m33 -mfpu=fpv5-sp-d16 -mfloat-abi=hard` |
| `rv32imf` | `riscv32-unknown-elf-gcc` | `-march=rv32imf -mabi=ilp32f` |
| `rv64gc` | `riscv64-unknown-elf-gcc` | `-march=rv64gc -mabi=lp64d` |

Embedded builds produce a static `.a` library (no `-fPIC`, no `-shared`) suitable for bare-metal linking.

```bash
cd dist/
make    # invokes arm-none-eabi-gcc, produces libtimber_model.a
```

---

## MISRA-C Compliance Mode

For safety-critical deployments (automotive, medical, avionics):

```python
from timber.codegen.misra_c import MisraCEmitter
from timber.frontends.auto_detect import parse_model

ir = parse_model("model.json")

# Generate MISRA-C:2012 compliant code
emitter = MisraCEmitter()
out = emitter.emit(ir)

# Check compliance (returns a ComplianceReport)
report = emitter.check_compliance(out.model_c)
print(f"Compliant: {report.is_compliant}")
print(f"Violations: {len(report.violations)}")
print(f"Rules checked: {report.rules_checked}")
for v in report.violation_objects:
    print(f"  [{v.severity}] Rule {v.rule}: {v.description}")

# Write to disk and compile normally
out.write("./dist/")
```

Rules checked by the built-in verifier:

| Rule | Description |
|------|-------------|
| 1.1 | No compiler extensions (`__attribute__`, `__declspec`) |
| 7.1 | No octal integer literals |
| 14.4 | No VLAs |
| 20.4 | No `#undef` |
| 20.9 | No `<stdio.h>` include |
| 21.1 | No reserved-identifier redefinition |
| 21.6 | No `printf`/`scanf` |
| 22.x | All variables initialized at declaration |
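To illustrate the kind of textual checks the table describes, here is a toy scan for two of the rules (7.1 octal literals, 20.4 `#undef`). It is a simplified stand-in, not the logic of `check_compliance`, and `quick_scan` is a hypothetical helper:

```python
import re

# Rule 7.1: a literal 0 followed by octal digits (naive; e.g. ignores 0x hex
# because 'x' is outside the digit class, but would misfire inside strings).
OCTAL = re.compile(r"\b0[0-7]+\b")
# Rule 20.4: a preprocessor #undef directive at the start of a line.
UNDEF = re.compile(r"^\s*#\s*undef\b", re.MULTILINE)

def quick_scan(c_source):
    """Return the list of (toy) rule IDs violated by the given C source."""
    violations = []
    if OCTAL.search(c_source):
        violations.append("7.1")
    if UNDEF.search(c_source):
        violations.append("20.4")
    return violations

print(quick_scan("int x = 010;\n#undef FOO\n"))  # → ['7.1', '20.4']
```

Real MISRA checking works on the token stream and preprocessor state rather than raw text, which is why the built-in verifier (and certified tools) go well beyond pattern matching like this.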

## Differential Privacy

Add calibrated noise to model outputs for privacy-preserving inference:

```python
import numpy as np

from timber.privacy.dp import DPConfig, apply_dp_noise, calibrate_epsilon

# Configure the mechanism
cfg = DPConfig(
    mechanism="laplace",   # or "gaussian"
    epsilon=1.0,           # privacy budget
    sensitivity=1.0,       # L1 sensitivity of your model output
    clip_outputs=True,
    output_min=0.0,
    output_max=1.0,
)

# Apply noise to raw inference outputs
raw_outputs = np.array([[0.85, 0.15], [0.32, 0.68]], dtype=np.float32)
noisy_outputs, report = apply_dp_noise(raw_outputs, cfg)

print(f"Mechanism: {report.mechanism}")
print(f"Noise scale: {report.noise_scale:.4f}")
print(f"Outputs: {report.n_outputs_noised}")
print(report.summary())
```

**Mechanisms:**

| Mechanism | Noise scale | Best for |
|-----------|-------------|----------|
| `laplace` | `sensitivity / epsilon` | Unbounded outputs, `delta = 0` |
| `gaussian` | `√(2 ln(1.25/δ)) · sensitivity / epsilon` | Bounded outputs, (`ε`, `δ`)-DP |
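The two noise scales in the table follow the standard Laplace and Gaussian mechanism calibrations, so they are easy to check numerically. The helpers below are illustrative, not the `timber.privacy.dp` internals:

```python
import math

def laplace_scale(sensitivity, epsilon):
    # Laplace mechanism: b = sensitivity / epsilon
    return sensitivity / epsilon

def gaussian_sigma(sensitivity, epsilon, delta):
    # Gaussian mechanism: sigma = sqrt(2 ln(1.25/delta)) * sensitivity / epsilon
    return math.sqrt(2 * math.log(1.25 / delta)) * sensitivity / epsilon

print(laplace_scale(1.0, 1.0))                    # → 1.0
print(round(gaussian_sigma(1.0, 1.0, 1e-5), 2))   # → 4.84
```

Note how the Gaussian scale grows only logarithmically as `delta` shrinks, while both scales grow linearly as the budget `epsilon` tightens.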

**Calibrating epsilon** — find the privacy budget needed to limit noise to a target level:

```python
epsilon = calibrate_epsilon(
    noise_level=0.05,   # tolerable noise standard deviation
    sensitivity=1.0,
    mechanism="laplace",
)
print(f"Required epsilon: {epsilon:.3f}")
```

**Notes:**

- Input dtype is preserved exactly (float32 in → float32 out; float64 in → float64 out)
- Pass `seed=42` for deterministic, reproducible noise (useful for testing)
- Apply *after* C99 / Python inference, *before* returning results to clients

---

## Differential Compilation

When retraining models incrementally, avoid full recompilation:

@@ -134,13 +279,11 @@

```python
# old_ir: IR of the previously compiled model (parsed earlier in this example)
new_ir = parse_model("model_v2.json")

# Compute diff
diff = diff_models(old_ir, new_ir)
print(diff.summary())
# {"added": 3, "removed": 1, "modified": 2, "unchanged": 44}

# Incremental compile (reuses unchanged trees, annotates IR with diff metadata)
updated_ir = incremental_compile(old_ir, new_ir, diff)
```
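As an illustration of what such a model diff computes, a structural-hash comparison over tree lists looks roughly like the sketch below. This is the idea only, not `diff_models` itself; `diff_trees` is a hypothetical helper, and position-based `modified` detection is omitted for brevity:

```python
import hashlib
import json

def tree_hash(tree):
    # Canonical JSON so structurally identical trees hash identically.
    return hashlib.sha1(json.dumps(tree, sort_keys=True).encode()).hexdigest()

def diff_trees(old, new):
    """Classify trees (as dicts) into added / removed / unchanged by hash."""
    old_h = {tree_hash(t) for t in old}
    new_h = {tree_hash(t) for t in new}
    return {
        "added": len(new_h - old_h),
        "removed": len(old_h - new_h),
        "unchanged": len(old_h & new_h),
    }

old = [{"leaf": 1.0}, {"leaf": 2.0}]
new = [{"leaf": 2.0}, {"leaf": 3.0}, {"leaf": 4.0}]
print(diff_trees(old, new))  # → {'added': 2, 'removed': 1, 'unchanged': 1}
```

Only the trees in the `added` set need code generation, which is what makes incremental compilation cheap when most of the ensemble is untouched.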
## Ensemble Composition
