41 changes: 41 additions & 0 deletions GENERATION.md
@@ -0,0 +1,41 @@
# MobilitySpark generation — the canonical per-binding generator policy

This document is the contract for how MobilitySpark is generated, under the ecosystem-wide
per-binding generator policy.

## The policy (ecosystem-wide)

Every MobilityDB language/surface binding is a **pure projection of the MEOS-API catalog**,
and **each binding owns its own generator, in its own repo**, in a canonical layout. The
single source of truth is the **catalog** (`MEOS-API/output/meos-idl.json`, generated from
the MEOS C headers). A binding is an independent, plug-and-play module that owns its
generation.

Each binding repo satisfies the same invariants: in-repo generator; own
`tools/pin/compose-order.txt`; pinned catalog/jar input; thin language projection
(language-neutral decisions live in the catalog); full automation toward a zero-hand-written
surface (generate-then-retire; the last green-CI version is the equivalence probe).

## MobilitySpark scope: generated Spark UDFs over the JMEOS surface

MobilitySpark is a **consumer** binding: it binds the **JMEOS jar** (the JVM FFI projection
of the catalog), not MEOS-API directly. Its generator **`tools/codegen_spark_udfs.py`**
mirrors the JMEOS `FunctionsGenerator`: it reads the JMEOS surface (and the catalog's
`@sqlfn` names) and emits the Spark UDF registration layer, organized **by `@ingroup` group**
(one unit per group, the same structure as the reference manual). The generator enforces its
own regularity invariant at build time (every emitted `register()` is preceded by the
per-thread MEOS-init guard).
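
The regularity invariant can be pictured as a simple scan over the emitted Java. The sketch below is not the generator's actual check: it assumes the guard is a call named `ensureMeosInit(` (a hypothetical name) and that "preceded by" means the guard sits on the nearest non-blank line before each `register(` call.

```python
import re

# Hypothetical guard name: the real per-thread MEOS-init call is not given here.
GUARD = "ensureMeosInit("
# Matches register( but not registerAll(.
REGISTER = re.compile(r"\bregister\(")


def unguarded_registers(java_source: str) -> list[int]:
    """Return 1-based line numbers of register( calls whose nearest
    preceding non-blank line does not contain the guard call."""
    offenders = []
    previous = ""  # last non-blank line seen so far
    for lineno, line in enumerate(java_source.splitlines(), start=1):
        stripped = line.strip()
        if not stripped:
            continue
        if REGISTER.search(stripped) and GUARD not in previous:
            offenders.append(lineno)
        previous = stripped
    return offenders


# Illustrative input only; the UDF names and arguments are made up.
sample = """\
ensureMeosInit();
spark.udf().register("udfA", fA, returnTypeA);

spark.udf().register("udfB", fB, returnTypeB);
"""
```

A build step could fail when `unguarded_registers` returns a non-empty list; for `sample` it reports the second registration, which has no guard directly before it.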

## Generate-then-retire — the green-CI version is the probe

The hand-written `*UDFs.java` registrations are replaced by the generated surface **family
by family, never wipe-first**: generate, build green, **prove generated ⊇ hand** against the
**last green-CI version** (the test suite + the BerlinMOD benchmark), then retire the hand
registrations. End state: the UDF layer is the generated `registerAll()` — zero hand-written
registrations.
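
A cheap precondition for the `generated ⊇ hand` proof is a name-level coverage check. The sketch below is an illustration, not the project's probe (the probe is the test suite plus the BerlinMOD benchmark): it assumes both layers register UDFs with a string-literal name as the first argument of `register(`, and the helper names are hypothetical.

```python
import re

# Assumption: UDF names appear as the first, string-literal argument of register(.
NAME = re.compile(r'\bregister\(\s*"([^"]+)"')


def registered_names(java_source: str) -> set[str]:
    """Collect the UDF names registered in one Java source text."""
    return set(NAME.findall(java_source))


def missing_from_generated(hand_source: str, generated_source: str) -> set[str]:
    """Names the hand-written layer registers that the generated layer does not.
    An empty result means generated is a superset of hand at the name level."""
    return registered_names(hand_source) - registered_names(generated_source)


# Illustrative inputs only; the UDF names are made up.
hand = 'spark.udf().register("udfA", a, t);\nspark.udf().register("udfB", b, t);\n'
generated = 'spark.udf().register("udfA", a, t);\nspark.udf().register("udfC", c, t);\n'
```

A family would be a candidate for retirement only once `missing_from_generated` is empty for it; equal names say nothing about equal behavior, which is why the green-CI comparison remains the actual proof.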

## Pinning

The JMEOS jar (and through it the catalog) is pinned to a MobilityDB `ecosystem-pin-*` /
deliverable-PR head. That pin is the *catalog/surface* input; MobilitySpark's own
`tools/pin/compose-order.txt` governs the order in which *this repo's* PRs are folded.
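
To make the manifest concrete, here is a minimal sketch of a reader for `tools/pin/compose-order.txt`, based on the format line in that file (`<PR#> <head-branch> # role`). The `PinEntry` type and the function name are hypothetical, and the handling of `?` is an assumption: the manifest does not say where the marker is placed, so any `?` before the role comment is treated as "unconfirmed".

```python
from typing import NamedTuple


class PinEntry(NamedTuple):
    pr: int            # pull request number
    branch: str        # head branch to fold
    role: str          # free-text role after '#'
    unconfirmed: bool  # assumption: a '?' before the role marks the entry unconfirmed


def parse_compose_order(text: str) -> list[PinEntry]:
    """Parse manifest lines in file order, skipping blank and comment lines."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split at the first '#': everything after it is the role text.
        body, _, role = line.partition("#")
        unconfirmed = "?" in body
        fields = body.replace("?", " ").split()
        if len(fields) != 2 or not fields[0].isdigit():
            raise ValueError(f"malformed pin line: {raw!r}")
        entries.append(PinEntry(int(fields[0]), fields[1], role.strip(), unconfirmed))
    return entries


manifest = """\
# WAVE 0: GENERATOR
27 feat/spark-udf-generator   # the catalog-driven Spark UDF generator (foundation)
28 feat/generated-dispatch    # register the catalog-generated UDF dispatch surface (on #27)
"""
```

Because the entries come back in file order, folding them in sequence respects the dependency order the manifest encodes (here, #28 on top of #27).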
31 changes: 31 additions & 0 deletions tools/pin/compose-order.txt
@@ -0,0 +1,31 @@
# USER-APPROVED-PIN-WRITE — creating MobilitySpark's first pin manifest (user 2026-06-25,
# per-binding generator policy rollout). New file in the MobilitySpark repo, NOT a mutation
# of MobilityDB's pin tooling.
#
# MobilitySpark pin — THE canonical, dependency-ordered fold manifest (per-binding policy).
#
# MobilitySpark is a CONSUMER binding: it binds the JMEOS jar (not MEOS-API directly) and
# generates its Spark UDF surface from it. `main` predates the catalog-driven generator,
# which lives in the open stack. (policy: generator-per-binding-canonical-policy)
#
# SCOPE: MobilitySpark owns its generator IN-REPO at `tools/codegen_spark_udfs.py` (mirrors
# the JMEOS FunctionsGenerator; emits the Spark UDF registrations, organized by @ingroup).
#
# Format: <PR#> <head-branch> # role. '?' = membership/order UNCONFIRMED.
# base = current origin/main. Derived from the live DAG (gh pr list at the time of writing).

# ── WAVE 0 — GENERATOR ──
27 feat/spark-udf-generator # the catalog-driven Spark UDF generator (foundation)
28 feat/generated-dispatch # register the catalog-generated UDF dispatch surface (on #27)

# ── WAVE 1 — BENCHMARK (evidence vehicle; not a deliverable) ──
23 feat/berlinmod-benchmark # Spark-only BerlinMOD harness consuming the canonical suite
16 integration/berlinmod-bench # integration evidence (907/907 green)

# ── WAVE 2 — DOCS ──
8 doc/reviewer-guide # PR Reviewer Guide (uniform with MobilityDB/MobilityDuck)

# ════════════════════════════════════════════════════════════════════════════════════
# DISPOSITION: land the generator (#27) + the generated dispatch (#28); the hand UDF layer
# is retired generate-then-retire (the whole UDF layer becomes registerAll). See GENERATION.md.
# ════════════════════════════════════════════════════════════════════════════════════
21 changes: 21 additions & 0 deletions tools/regen-from-pin.sh
@@ -0,0 +1,21 @@
#!/usr/bin/env bash
# regen-from-pin.sh — regenerate the MobilitySpark UDF layer from the catalog + JMEOS jar
# (per GENERATION.md). MobilitySpark is a JMEOS consumer.
#
# Usage: tools/regen-from-pin.sh <pin>
# env: CATALOG = path to meos-idl.json produced by MEOS-API run.py (required)
# JMEOS_JAR = path to the JMEOS jar built from the same pin (required)
#
# Invoked standalone, or by MEOS-API tools/ecosystem-generate.sh (after the JMEOS jar).
set -euo pipefail
PIN="${1:?usage: regen-from-pin.sh <pin>}"
CATALOG="${CATALOG:?set CATALOG to the meos-idl.json from MEOS-API run.py}"
JMEOS_JAR="${JMEOS_JAR:?set JMEOS_JAR to the JMEOS jar built from the same pin}"
HERE="$(cd "$(dirname "$0")/.." && pwd)"

# run the in-repo generator (tools/codegen_spark_udfs.py: --catalog --jar) -> the Spark UDF layer
python3 "$HERE/tools/codegen_spark_udfs.py" --catalog "$CATALOG" --jar "$JMEOS_JAR"

# build-verify: a failing test run is reported on stderr as a warning and does not abort the script
( cd "$HERE" && mvn -q test ) || echo "WARN: MobilitySpark mvn test returned non-zero" >&2
echo "[spark] regenerated from catalog + JMEOS jar at pin $PIN"