Portable bare-name operator dialect (RFC #920)#26

Closed
estebanzimanyi wants to merge 5 commits into
MobilityDB:mainfrom
estebanzimanyi:feat/family-portable

@estebanzimanyi

Member

Stacked on #25 (sibling families) → #24 → #22. Adds the 29 canonical bare-name operator UDFs (PortableOperatorAliasUDFs), registered last so the bare names are the authoritative spelling across the portable dialect; each reuses its operator's own backing on the generated functions.GeneratedFunctions surface.

No underlying-surface change, so the unit suite is unchanged (907/907). Review #22, #24 and #25 first; this PR's own change is the portable commit on top.
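The "registered last" ordering can be illustrated with a minimal sketch. It assumes that a later registration under the same name replaces the earlier one, which is what makes the bare names authoritative. The `Registry` class, the bare name `eq` and the `generated_eq` backing are hypothetical stand-ins for illustration, not the MobilitySpark or Spark API.

```python
from typing import Callable, Dict


class Registry:
    """Toy stand-in for a session's UDF registry (hypothetical, not Spark's API)."""

    def __init__(self) -> None:
        self._udfs: Dict[str, Callable] = {}

    def register(self, name: str, fn: Callable) -> None:
        # Assumption: a later registration under the same name replaces the earlier one.
        self._udfs[name] = fn

    def call(self, name: str, *args):
        return self._udfs[name](*args)


def generated_eq(a, b):
    # Hypothetical stand-in for an operator's backing on the generated surface.
    return a == b


def register_earlier_family(reg: Registry) -> None:
    # An earlier family happens to claim the same bare name with other behaviour.
    reg.register("eq", lambda a, b: False)


def register_portable_aliases(reg: Registry) -> None:
    # The bare name reuses the operator's own backing rather than re-implementing it.
    reg.register("eq", generated_eq)


registry = Registry()
register_earlier_family(registry)
register_portable_aliases(registry)  # registered last, so this binding wins
result = registry.call("eq", 3, 3)
```

Because the alias only points at an existing backing function, the sketch adds no new behaviour of its own, which matches the claim that the underlying surface is unchanged.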

The MobilitySpark base UDF library — temporal, geo, boxes, time and set
surfaces plus the base infrastructure — bound to the canonical single-source
functions.GeneratedFunctions surface (the MEOS-API / meos-idl.json codegen
output), bundled as libs/JMEOS-1.4.jar regenerated against the ecosystem pin,
with lib/libmeos.so built from the pin (CBUFFER/NPOINT/POSE/RGEO + H3).

Every UDF binds the generated surface directly; the legacy hand-rolled
functions.functions facade is retired. The pg_-prefixed PG-compat I/O
(pg_interval_in/out, pg_timestamptz_in) is preserved to disambiguate the
PostgreSQL built-ins of the same name.

CI builds libmeos from the ecosystem pin on Linux/macOS (with H3, and an
in-source build dir so pgtypes/postgres.h's relative ../../meos/include
resolves); Windows is non-blocking. Full unit suite green (907/907).

The th3index, sibling (cbuffer/npoint/pose/rgeo) and portable-operator families
stack on this foundation as separate changes; the BerlinMOD benchmark builds on
the full surface.

The temporal H3-index (th3index) family for MobilitySpark, stacked on the
foundation library: the h3index/th3index UDFs (Th3IndexUDFs), the H3 cell
prefilter (Th3IndexPrefilterUDFs) and the JNR bindings, registered in the
session.

th3index is backed by libmeos built with -DH3=ON (the standalone library wires
the h3 object library at the current ecosystem pin); the UDFs bind the
generated functions.GeneratedFunctions surface (th3index_*, h3index_in/out,
the canonicalized parse/to_string -> in/out). No change to the library surface,
so the unit suite is unchanged (907/907).

The four sibling temporal families for MobilitySpark, stacked on the th3index
change: CbufferUDFs, NpointUDFs, PoseUDFs and RgeoUDFs, each binding the
generated functions.GeneratedFunctions surface for its family and registered
in the session.

No change to the foundation surface, so the unit suite is unchanged (907/907).

The 29 canonical bare-name operator UDFs (PortableOperatorAliasUDFs) for
MobilitySpark, stacked on the sibling families and registered last so the
bare names are the authoritative spelling across the portable dialect.

Each bare name reuses its operator's own backing on the generated
functions.GeneratedFunctions surface. No change to the underlying surface, so
the unit suite is unchanged (907/907).

The portable contract's three comparison families were standardized to a
single camelCase shape after the initial registrar landed: temporal #= ->
tempEq..tempGe (renamed from teq..tge), ever ?= -> everEq..everGe, always %=
-> alwaysEq..alwaysGe. Register the ever/always families (delegating to the
existing PredicateUDFs temporal-temporal predicates) and re-vendor
meta/portable-aliases.json to the camelCase contract. scripts/portable_parity.py
reports 41/41 backed, 0 unbacked, across all six families; full suite 907/907 green.
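
The camelCase naming and the backed/unbacked count can be sketched as follows. This is not the actual scripts/portable_parity.py, and it does not read meta/portable-aliases.json, whose layout is not shown here. It assumes each comparison family spans the six suffixes Eq, Ne, Lt, Le, Gt, Ge (the commit message only gives the Eq..Ge endpoints), and the `parity` helper is a hypothetical name.

```python
from typing import Iterable, List, Set, Tuple

# Operator prefix character -> camelCase family prefix, as named in the commit message.
PREFIXES = {"#": "temp", "?": "ever", "%": "always"}

# Assumption: every family covers these six comparisons, in this order.
SUFFIXES = ["Eq", "Ne", "Lt", "Le", "Gt", "Ge"]


def comparison_names() -> List[str]:
    """Build the bare names for the three comparison families."""
    return [prefix + suffix for prefix in PREFIXES.values() for suffix in SUFFIXES]


def parity(contract: Iterable[str], registered: Set[str]) -> Tuple[List[str], List[str]]:
    """Split contract names into those with a registered UDF and those without."""
    names = list(contract)
    backed = [name for name in names if name in registered]
    unbacked = [name for name in names if name not in registered]
    return backed, unbacked


contract = comparison_names()
# Pretend one alias was never registered, to show what an unbacked entry looks like.
registered = set(contract) - {"alwaysGe"}
backed, unbacked = parity(contract, registered)
print(f"{len(backed)}/{len(contract)} backed, {len(unbacked)} unbacked")
```

A fully backed contract is the case where `unbacked` comes back empty, which is the condition the commit message reports for all six families.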
estebanzimanyi added a commit to estebanzimanyi/MobilitySpark that referenced this pull request Jun 14, 2026
GeneratedSurfaceTest registers the catalog-generated UDFs
(GeneratedSpatioTemporalUDFs.registerAll) in a real SparkSession and asserts
results across families via spark.sql against libmeos: temporal_num_instants==3,
tint_start_value==1, tint_out renders the values, tnumber_integral is finite —
the safety gate proving the generated surface binds and executes before the hand
UDF layers (MobilityDB#22/MobilityDB#24/MobilityDB#25/MobilityDB#26) are retired. Adds junit-jupiter + surefire (fork per
class, JDK17 --add-opens) and bumps jnr-ffi to 2.2.17.

NOTE: these Spark-integration tests require JDK 17 — Spark 3.4 cannot init on
JDK 21 (DirectByteBuffer.<init>(long,int) removed) and the Java-17 JMEOS jar
cannot load on JDK 11. CI must run the Spark build/test on JDK 17.
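
The JDK constraint in the note could be enforced early with a small guard, sketched below. It only parses a `java.version`-style string; how CI obtains that string is left out. The helper names `java_major` and `check_jdk` are hypothetical, and requiring exactly 17 is an assumption that is stricter than the note establishes for versions between 17 and 21.

```python
import re


def java_major(version: str) -> int:
    """Return the major version from a java.version-style string."""
    parts = re.split(r"[._+-]", version)
    first = int(parts[0])
    # Legacy scheme: "1.8.0_292" denotes Java 8.
    if first == 1 and len(parts) > 1:
        return int(parts[1])
    return first


def check_jdk(version: str, required: int = 17) -> None:
    """Raise if the JDK major version is not the required one."""
    major = java_major(version)
    if major != required:
        raise RuntimeError(
            f"Spark integration tests need JDK {required}, found JDK {major}"
        )


check_jdk("17.0.9")  # passes silently
```

Failing fast with a clear message is easier to diagnose than the Spark initialization or jar-loading errors the note describes for JDK 21 and JDK 11.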
@estebanzimanyi

Member Author

Superseded by the catalog-driven generator (#27 foundation + #28 generated dispatch) and the rebuilt benchmark (#23), which now consumes the catalog-GENERATED UDF surface end-to-end. Per the North Star (bindings generated from the MEOS-API catalog, zero hand-written UDF surface), this hand UDF layer is retired; #23 no longer stacks on it, so there is no remaining consumer.
