A high-performance JavaScript (ECMAScript) engine written in pure Rust, with
no foreign code on the critical path. Kataan is usable three ways — as a Rust
library, as a C library, and as a standalone command-line tool — the same
tri-modal model proven out in the sibling projects
purecrypto (cryptography) and
rsurl (HTTP/curl).
Status: running and broadly conformant; advanced tiers in active build-out. The lexer and the full ECMAScript parser are complete, and two execution engines run real programs and are checked to agree on every test:
- a tree-walking interpreter (the default / corpus engine), and
- a register bytecode VM (the primary path for `kataan run` and the C ABI), which compiles nearly all of the common language directly: every operator, objects/arrays, method calls with `call`/`apply`/`bind`, `new`/`new.target`, all loops plus `for-of`/`for-in`/`switch`/`try-catch-finally`, closures (incl. mutual recursion), destructuring, rest/spread, classes with `extends`/`super` and getters/setters, generators (incl. `yield*` and `.throw()`), and `async`/`await`. It falls back to the tree-walker for the handful of constructs it doesn't yet compile.

A dual-path Test262-style conformance corpus (520/520) passes on both engines, covering closures, classes/inheritance (incl. `extends` of native errors), optional chaining, the iterator protocol, `Map`/`Set`/`WeakMap`, `Symbol` (incl. `Symbol.hasInstance`), `BigInt`, `Promise` + async/await, `Proxy`/`Reflect` (incl. the `ownKeys` trap driving `Object.keys`/`values`/`entries`/`for-in`), typed arrays, `Date`, an in-house `RegExp`, and a large standard library (Math, JSON, Object/Array/String/Number). Compiled bytecode can be serialized, reloaded, and run without the source.

Three advanced tiers are real and tested, though each has named work remaining:

- a machine-code JIT (x86-64 / Linux, behind `jit`) with an optimizing integer path (four-pass optimizer + register allocator) and a float path covering `+ - * / %`, comparisons, control flow, and the SSE-expressible `Math` intrinsics (sqrt/abs/min/max/floor/ceil/trunc), emitting into W^X memory via raw syscalls; object/string ops stay interpreted;
- a pure-Rust, `no_std` WebAssembly engine (full MVP plus sign-extension, saturating conversion, bulk-memory, multi-value, and typed structured control) with a JS↔WASM boundary (validate/compile/instantiate, the `Module`/`Instance`/`Global`/`Memory` objects, host-function imports, and stateful instances), driven by a `.wast`/WAT spec harness (a spec-derived corpus, not yet the full upstream suite);
- a zero-copy "D′" snapshot tier atop the moving GC: a verified codec that `mmap`-reloads a heap (eleven reference cell kinds, cross-kind cycles, insertion-order-preserving) and runs a restored closure both in place and reloaded into a fresh runtime.

Kataan works as a CLI/REPL, a Rust library, and a C library (`kt_eval`). See the roadmap for the remaining road to a complete engine.

Modern JavaScript engines (V8, JavaScriptCore, SpiderMonkey) all rely on the same handful of techniques. Kataan commits to the full set from the architecture stage rather than retrofitting them:
- NaN-boxed values: every JS value in 64 bits, `Copy`, dense on the stack.
- Hidden classes (shapes) + inline caches: property access becomes a slot load, not a hash probe; the single biggest lever for real-world JS speed.
- Register-based bytecode VM: fewer instructions than a stack VM, and JIT-friendly by construction.
- Interned atoms + rope strings: O(1) key comparison, non-quadratic string building.
- A precise, generational, moving GC: bump allocation makes `new` nearly free.
- Tiered execution: a fast interpreter first, then a baseline JIT, then an optimizing JIT driven by inline-cache type feedback.

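To make the first technique in the list concrete, here is a minimal NaN-boxing sketch. It is an illustration only: the `Value` type, the tag constants, and the bit layout are assumptions chosen for this example, not Kataan's actual representation. Doubles are stored as their raw bits, and every other kind of value hides inside the otherwise unused payload of a NaN.

```rust
/// A JS-like value packed into 64 bits (hypothetical layout, for illustration).
#[derive(Clone, Copy, Debug)]
pub struct Value(u64);

// Sign bit + all exponent bits + quiet bit set: no canonicalized double looks like this.
const BOX_MASK: u64 = 0xfff8_0000_0000_0000;
// Three tag bits just below the quiet bit select the boxed kind.
const TAG_MASK: u64 = 0x0007_0000_0000_0000;
const TAG_INT: u64 = 0x0001_0000_0000_0000;
const TAG_BOOL: u64 = 0x0002_0000_0000_0000;
const TAG_UNDEFINED: u64 = 0x0003_0000_0000_0000;
// The single NaN bit pattern we allow for real doubles.
const CANONICAL_NAN: u64 = 0x7ff8_0000_0000_0000;

impl Value {
    /// Store a double as-is, collapsing every NaN to one canonical pattern
    /// so that a hardware NaN can never be mistaken for a boxed value.
    pub fn from_f64(f: f64) -> Self {
        if f.is_nan() { Value(CANONICAL_NAN) } else { Value(f.to_bits()) }
    }

    /// Store a 32-bit integer in the low payload bits.
    pub fn from_i32(i: i32) -> Self {
        Value(BOX_MASK | TAG_INT | (i as u32 as u64))
    }

    pub fn from_bool(b: bool) -> Self {
        Value(BOX_MASK | TAG_BOOL | (b as u64))
    }

    pub fn undefined() -> Self {
        Value(BOX_MASK | TAG_UNDEFINED)
    }

    fn is_boxed(self) -> bool {
        (self.0 & BOX_MASK) == BOX_MASK
    }

    fn tag(self) -> u64 {
        self.0 & TAG_MASK
    }

    pub fn as_f64(self) -> Option<f64> {
        if self.is_boxed() { None } else { Some(f64::from_bits(self.0)) }
    }

    pub fn as_i32(self) -> Option<i32> {
        if self.is_boxed() && self.tag() == TAG_INT {
            Some(self.0 as u32 as i32) // truncate back to the low 32 bits
        } else {
            None
        }
    }

    pub fn as_bool(self) -> Option<bool> {
        if self.is_boxed() && self.tag() == TAG_BOOL {
            Some((self.0 & 1) == 1)
        } else {
            None
        }
    }

    pub fn is_undefined(self) -> bool {
        self.0 == (BOX_MASK | TAG_UNDEFINED)
    }
}

fn main() {
    let values = [Value::from_f64(1.5), Value::from_i32(-5), Value::from_bool(true), Value::undefined()];
    for v in values {
        println!("{:?} f64={:?} i32={:?} bool={:?}", v, v.as_f64(), v.as_i32(), v.as_bool());
    }
}
```

Because `Value` wraps a single `u64`, it is `Copy` and occupies eight bytes, which is what lets values sit densely in registers and on the stack. A real engine would also reserve a tag for heap pointers, which this sketch omits.
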
The language core is sans-I/O and no_std + alloc; the host runtime (event
loop, timers, fetch, crypto, modules) is a separate layer on top, so the
engine stays embeddable. See ROADMAP.md for the road ahead — the
remaining work to a complete JS+WASM engine and the design invariants behind it.
Kataan depends on no C libraries. Where it needs cryptography or networking it reuses sibling pure-Rust Karpelès Lab crates:
- `purecrypto`: `crypto.subtle` / WebCrypto, `crypto.getRandomValues`, `randomUUID`, and TLS.
- `rsurl`: HTTP/HTTPS transport behind `fetch` and the Node `http(s)` compatibility layer.

`unsafe` is quarantined: the crate is `unsafe_code = "deny"` (not `forbid`),
and only the `ffi` module plus a small, audited set of VM hot-path primitives
opt back in with a scoped `#[allow(unsafe_code)]` and a safety comment.
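
The difference between `deny` and `forbid` is what makes this scheme possible: a `deny` lint level can be overridden in a narrower scope, while `forbid` cannot. The sketch below shows the pattern in a single file. The module and function names are hypothetical, and the lint is set with an attribute here only so the example stands alone; a crate-wide setting would normally live in the manifest's lints table.

```rust
// Everything inside this module rejects `unsafe` unless it explicitly opts back in.
#[deny(unsafe_code)]
mod engine {
    /// Ordinary safe code: writing an `unsafe` block here would fail to compile.
    pub fn sum(slots: &[u64]) -> u64 {
        slots.iter().sum()
    }

    /// Hypothetical hot-path primitive that opts back in, with a safety comment.
    /// With `forbid` instead of `deny`, this `allow` would itself be an error.
    #[allow(unsafe_code)]
    pub fn slot_unchecked(slots: &[u64], index: usize) -> u64 {
        assert!(index < slots.len(), "slot index out of range");
        // SAFETY: the assertion above guarantees `index` is in bounds.
        unsafe { *slots.get_unchecked(index) }
    }
}

fn main() {
    let slots = [10, 20, 30];
    println!("slot 1 = {}", engine::slot_unchecked(&slots, 1));
    println!("sum = {}", engine::sum(&slots));
}
```

Keeping each opt-in on a single item, rather than on a whole module, keeps the audited surface small and easy to find by searching for the `allow` attribute.
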
The CLI runs JavaScript today:
$ cargo run -- run -e '
class Animal { constructor(n){ this.n = n } speak(){ return `${this.n} makes a sound` } }
class Dog extends Animal { speak(){ return `${this.n} barks` } }
console.log(new Dog("Rex").speak());
console.log([1,2,3,4].filter(x => x % 2).map(x => x*x).reduce((a,b)=>a+b, 0));
console.log(JSON.stringify({ ok: true, items: [...new Set([1,1,2,3])] }));
'
Rex barks
10
{"ok":true,"items":[1,2,3]}

It also exposes each pipeline stage, and an interactive REPL:
$ cargo run -- lex -e 'x => x * 2' # token stream
$ cargo run -- parse -e 'x => x * 2' # AST dump
$ cargo run -- disasm -e '1 + 2 * 3' # register bytecode
$ cargo run -- repl # interactive session
$ cargo run -- --help

The disasm command shows the register bytecode the compiler emits:
$ cargo run -- disasm -e 'let s = 0; let i = 0; while (i < 3) { s += i; i += 1; } s'
chunk #0 "<main>" (regs=14, params=0)
0 LoadInt r0, 0
...
6 Lt r6, r4, r5
7 JumpIfFalse r6, +9
...
16 Jump -13
18 Return r13

From Rust, parse and run a program with the library API:

use kataan::parser::Parser;
use kataan::interp::Interp;
let program = Parser::parse_program("const sq = x => x * x; sq(8)").unwrap();
let mut interp = Interp::new();
assert_eq!(interp.run(&program).unwrap().to_js_string(), "64");

The lower stages are available directly too:
use kataan::lexer::{Lexer, TokenKind};
let tokens = Lexer::new("let answer = 42;").tokenize().unwrap();
assert_eq!(tokens.first().unwrap().text("let answer = 42;"), "let");
assert_eq!(tokens.last().unwrap().kind, TokenKind::Eof);

| Feature | Default | Description |
|---|---|---|
| `std` | ✓ | Standard library; implies `alloc`. Needed by the host runtime/CLI. |
| `alloc` | ✓ | Heap-backed types; the minimum for the pure language core. |
| `regex` | ✓ | In-house regular-expression engine. |
| `intl` | ✓ | In-house Intl-lite (collation, number/date formatting). |
| `module` | ✓ | ESM + CommonJS module loader. |
| `host` | ✓ | Host runtime: event loop, timers, console, encoding, URL, streams. |
| `fetch` | | `fetch` / Node `http(s)` over `rsurl`. |
| `crypto` | | `crypto.getRandomValues` / WebCrypto over `purecrypto`. |
| `jit` | | Machine-code JIT (x86-64/Linux): optimizing integer + float paths. |
| `ffi` | | The C ABI (the only place broad `unsafe` is allowed). |
| `cli` | ✓ | The `kataan` command-line tool. |

Build the bare no_std language core with:
cargo build --no-default-features --features alloc

Build the C library as a static or shared library with the `ffi` feature:

cargo rustc --lib --release --features ffi --crate-type staticlib # libkataan.a
cargo rustc --lib --release --features ffi --crate-type cdylib # libkataan.so

The header is `include/kataan.h`; a runnable example lives in `tests/ffi_smoke.c`. The C ABI follows the purecrypto conventions: `KtStatus` return codes, the in/out length convention, opaque handles, and a panic catch at every boundary.

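The four conventions named above can be sketched on the Rust side of a C boundary roughly as follows. Every name here (`DemoStatus`, `DemoEngine`, and the `demo_engine_*` functions) is hypothetical and is not taken from `kataan.h`; the sketch only assumes the common reading of an in/out length, where the caller passes the buffer capacity and receives the required size back. A real export would also carry a no-mangle attribute, omitted here so the example compiles unchanged across Rust editions.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};
use std::slice;

/// Hypothetical status codes, in the spirit of a `KtStatus`-style return value.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DemoStatus {
    Ok = 0,
    NullPointer = 1,
    BufferTooSmall = 2,
    Panic = 3,
}

/// Opaque handle: C code only ever holds a pointer to this type.
pub struct DemoEngine {
    greeting: String,
}

/// Allocate a handle; returns null if construction panics.
pub extern "C" fn demo_engine_new() -> *mut DemoEngine {
    catch_unwind(|| Box::into_raw(Box::new(DemoEngine { greeting: String::from("hello") })))
        .unwrap_or(std::ptr::null_mut())
}

/// Release a handle previously returned by `demo_engine_new`. Null is ignored.
pub unsafe extern "C" fn demo_engine_free(engine: *mut DemoEngine) {
    if !engine.is_null() {
        // SAFETY: the caller promises the pointer came from `demo_engine_new`
        // and has not been freed already.
        drop(unsafe { Box::from_raw(engine) });
    }
}

/// Copy the greeting into `out`. On entry `*len` is the capacity of `out`;
/// on exit it is the number of bytes required (and written, on success).
pub unsafe extern "C" fn demo_engine_greeting(
    engine: *const DemoEngine,
    out: *mut u8,
    len: *mut usize,
) -> DemoStatus {
    // Catch any panic so it never unwinds across the C boundary.
    let result = catch_unwind(AssertUnwindSafe(|| {
        if engine.is_null() || len.is_null() {
            return DemoStatus::NullPointer;
        }
        // SAFETY: both pointers were checked for null; validity is the caller's contract.
        let engine_ref: &DemoEngine = unsafe { &*engine };
        let bytes = engine_ref.greeting.as_bytes();
        let capacity = unsafe { *len };
        unsafe { *len = bytes.len() };
        if out.is_null() || capacity < bytes.len() {
            return DemoStatus::BufferTooSmall;
        }
        // SAFETY: `out` is non-null and the caller stated it holds `capacity` bytes.
        unsafe { slice::from_raw_parts_mut(out, bytes.len()) }.copy_from_slice(bytes);
        DemoStatus::Ok
    }));
    result.unwrap_or(DemoStatus::Panic)
}

fn main() {
    let engine = demo_engine_new();
    let mut buf = [0u8; 16];
    let mut len = buf.len();
    let status = unsafe { demo_engine_greeting(engine, buf.as_mut_ptr(), &mut len) };
    println!("{:?}: {}", status, String::from_utf8_lossy(&buf[..len]));
    unsafe { demo_engine_free(engine) };
}
```

Reporting the required size even on failure lets a caller retry with a correctly sized buffer in one extra call, which is the usual reason for choosing an in/out length over a separate size query.
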
MIT © 2026 Karpelès Lab Inc. See LICENSE.