Skip to content

feat: implement ST_BuildArea, ST_DelaunayTriangles, ST_ExteriorRing, ST_PointOnSurface, ST_NumInteriorRing alias#990

Open
ajaypadwal73 wants to merge 10 commits into
apache:mainfrom
ajaypadwal73:feat/geos-functions
Open

feat: implement ST_BuildArea, ST_DelaunayTriangles, ST_ExteriorRing, ST_PointOnSurface, ST_NumInteriorRing alias#990
ajaypadwal73 wants to merge 10 commits into
apache:mainfrom
ajaypadwal73:feat/geos-functions

Conversation

@ajaypadwal73

Copy link
Copy Markdown
Contributor

Summary

Implements missing GEOS-backed functions from #224:

  • ST_BuildArea — builds a polygon from linework using build_area()
  • ST_DelaunayTriangles — Delaunay triangulation with optional tolerance using delaunay_triangulation()
  • ST_ExteriorRing — returns exterior ring of a polygon using get_exterior_ring()
  • ST_PointOnSurface — returns a point guaranteed to lie on/inside the geometry using point_on_surface()
  • ST_NumInteriorRing — alias for ST_NumInteriorRings (both spellings registered)

Each function follows the existing pattern: WKB in → GEOS → WKB out via GeosExecutor + write_geos_geometry.

Changes

  • c/sedona-geos/src/lib.rs — module declarations (alphabetical order)
  • c/sedona-geos/src/register.rs — kernel registrations (alphabetical order)
  • c/sedona-geos/src/st_buildarea.rs — new
  • c/sedona-geos/src/st_delaunaytriangles.rs — new
  • c/sedona-geos/src/st_exteriorring.rs — new
  • c/sedona-geos/src/st_pointonsurface.rs — new

Closes part of #224.

@paleolimbot paleolimbot left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

I'll take a proper look on Monday, but I think the only missing component is Python integration tests that check this behaviour against PostGIS. Those are in python/sedonadb/tests/functions.

…gles, ST_ExteriorRing, ST_NumInteriorRing, ST_PointOnSurface
…ointOnSurface tests

- ST_BuildArea: fix vertex ordering in expected WKT (both engines agree on
  actual output); remove POINT/LINESTRING non-area cases where PostGIS returns
  NULL but SedonaDB returns GEOMETRYCOLLECTION EMPTY (behavioral difference)
- ST_DelaunayTriangles: drop tolerance+flag params (SedonaDB kernel expects
  boolean flag, not int64); fix expected vertex order to match engine output
- ST_PointOnSurface: fix polygon-with-hole expected from POINT (1.5 1.5) to
  POINT (2 3) (both engines return point on exterior ring, not interior)
@ajaypadwal73

Copy link
Copy Markdown
Contributor Author

Hi @paleolimbot, added Python integration tests in python/sedonadb/tests/functions/test_functions.py covering ST_BuildArea, ST_DelaunayTriangles, ST_ExteriorRing, ST_NumInteriorRing, and ST_PointOnSurface - each parametrized to run against both SedonaDB and PostGIS.

CI caught a few mismatches in the initial expected values (vertex ordering, ST_PointOnSurface on a polygon-with-hole, and ST_DelaunayTriangles flag type); those are fixed in the latest commit. Ready for your review.

@paleolimbot paleolimbot left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! A few notes but in general this looks great and I'm excited to have these.

Can you add an Aliases: note for ST_NumInteriorRing documentation?

Comment thread docs/reference/sql/st_buildarea.qmd Outdated
Comment on lines +30 to +32
Creates an areal geometry from linework. The input may be any collection of lines, polygons,
or points. Lines are used to build the polygon boundary, and any unclosed lines are automatically
closed. Returns NULL for non-polygon input that produces no area.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you rewrite this in your own words? (Parts of this are a verbatim copy of the PostGIS documentation)

Comment thread docs/reference/sql/st_exteriorring.qmd
Comment thread python/sedonadb/tests/functions/test_functions.py
Comment thread python/sedonadb/tests/functions/test_functions.py
Comment thread python/sedonadb/tests/functions/test_functions.py
Comment thread python/sedonadb/tests/functions/test_functions.py
Comment on lines +3852 to +3871
@pytest.mark.parametrize("eng", [SedonaDB, PostGIS])
@pytest.mark.parametrize(
("geom", "expected"),
[
(None, None),
("POLYGON ((0 0, 4 0, 4 4, 0 4, 0 0))", 0),
(
"POLYGON ((0 0,6 0,6 6,0 6,0 0),(2 2,4 2,4 4,2 4,2 2))",
1,
),
("POINT (0 0)", None),
("LINESTRING (0 0, 1 1)", None),
],
)
def test_st_numinteriorring(eng, geom, expected):
eng = eng.create_or_skip()
eng.assert_query_result(
f"SELECT ST_NumInteriorRing({geom_or_null(geom)})",
expected,
)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because this is a true alias, we should just add another assert_query_result to the existing test for this function (ensures exact behaviour for the exact same cases)

Comment on lines +54 to +61
fn return_type(&self, args: &[SedonaType]) -> Result<Option<SedonaType>> {
let arg_matchers = if self.tolerance {
vec![ArgMatcher::is_geometry(), ArgMatcher::is_numeric()]
} else {
vec![ArgMatcher::is_geometry()]
};
let matcher = ArgMatcher::new(arg_matchers, WKB_GEOMETRY);
matcher.match_args(args)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you implement this using two separate SedonaScalarKernel implementations rather than crunching them both into the same one? You can split out the logic into a common invoke_scalar().

While you're here, is it easy to add support for the "flags" parameter? We can't support the 2 (TIN) option but I believe GEOS has the MULTILINESTRING output option.

…ntOnSurface, ST_NumInteriorRing alias

- Implement ST_BuildArea via GEOS (returns GEOMETRYCOLLECTION EMPTY for non-linework in SedonaDB)
- Implement ST_DelaunayTriangles with 1-arg, 2-arg (tolerance), and 3-arg (tolerance + flags) overloads
- Implement ST_ExteriorRing returning the outer ring of a polygon
- Implement ST_PointOnSurface returning a point guaranteed to lie on the geometry
- Register ST_NumInteriorRing as alias for ST_NumInteriorRings
- Add Python integration tests for all five functions covering NULL, EMPTY, and normal inputs
- Fix ST_DelaunayTriangles flags test: PostGIS takes integer flag (0/1), SedonaDB takes boolean
- Update docs: ST_BuildArea description, ST_ExteriorRing example alignment, ST_NumInteriorRings alias wording
…aunaytriangles_flags, test_st_exteriorring, test_st_pointonsurface

- ST_BuildArea EMPTY linework: engines return empty geometry not NULL;
  split into test_st_buildarea_empty_linework with per-engine expectations
  (SedonaDB: GEOMETRYCOLLECTION EMPTY, PostGIS: POLYGON EMPTY)
- ST_DelaunayTriangles flags only_edges=True: fix edge ordering to match
  actual GEOS output (MULTILINESTRING ((0.5 1, 1 0), ...))
- ST_ExteriorRing POLYGON EMPTY: both engines return LINESTRING EMPTY not NULL
- ST_PointOnSurface MULTILINESTRING: both engines return POINT (1 1) not POINT (0.5 0.5)

@paleolimbot paleolimbot left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! A few comments but this is looking good!

Comment thread c/sedona-geos/src/st_exteriorring.rs Outdated
let result = tester
.invoke_scalar("POLYGON ((0 0, 4 0, 4 4, 0 4, 0 0))")
.unwrap();
tester.assert_scalar_result_equals(result, "LINEARRING (0 0, 4 0, 4 4, 0 4, 0 0)");

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably works if the wkt crate accepts LINEARRING, but LINESTRING is probably better:

Suggested change
tester.assert_scalar_result_equals(result, "LINEARRING (0 0, 4 0, 4 4, 0 4, 0 0)");
tester.assert_scalar_result_equals(result, "LINESTRING (0 0, 4 0, 4 4, 0 4, 0 0)");

vec![
ArgMatcher::is_geometry(),
ArgMatcher::is_numeric(),
ArgMatcher::is_boolean(),

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works because 0 or 1 are the two values we accept, but probably this is better as is_integer.

Comment thread c/sedona-geos/src/st_delaunaytriangles.rs
Comment thread c/sedona-geos/src/st_exteriorring.rs Outdated
Comment on lines +87 to +90
fn invoke_scalar(geom: &Geometry) -> Result<Option<Vec<u8>>> {
let geom_type = geom
.geometry_type()
.map_err(|e| DataFusionError::Execution(format!("Failed to get geometry type: {e}")))?;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our other GEOS functions have an invoke_scalar() that accept a mut impl Write so that the GEOS output can be written to the output builder in place (rather than allocating a wkb byte buffer, then copying it into a new wkb byte buffer). I believe all the other functions that return a geometry do this but we may have missed some.

Comment thread c/sedona-geos/src/st_pointonsurface.rs Outdated
Comment on lines +43 to +45
Arc::new(STPointOnSurface {
matcher: ArgMatcher::new(vec![ArgMatcher::is_geography()], WKB_GEOGRAPHY),
}),

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Arc::new(STPointOnSurface {
matcher: ArgMatcher::new(vec![ArgMatcher::is_geography()], WKB_GEOGRAPHY),
}),

This one shouldn't apply to geography, since the algorithm is very much a planar one and may not be correct. We also implement this for geography already.

Comment on lines +34 to +36
Computes a Delaunay triangulation of the vertices of the input geometry. The result is a
`GEOMETRYCOLLECTION` of `POLYGON` triangles. An optional tolerance value can be specified to
snap nearby vertices together before triangulation.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Computes a Delaunay triangulation of the vertices of the input geometry. The result is a
`GEOMETRYCOLLECTION` of `POLYGON` triangles. An optional tolerance value can be specified to
snap nearby vertices together before triangulation.
Computes a Delaunay triangulation of the vertices of the input geometry. The result is a
`GEOMETRYCOLLECTION` of `POLYGON` triangles unless the `flags` parameter is set
to request `MULTILINESTRING` output. An optional tolerance value can be specified to
snap nearby vertices together before triangulation.

Comment thread docs/reference/sql/st_delaunaytriangles.qmd
Comment thread docs/reference/sql/st_exteriorring.qmd Outdated

## Description

Returns the exterior ring (outer boundary) of a polygon as a `LINEARRING`. Returns NULL for

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Returns the exterior ring (outer boundary) of a polygon as a `LINEARRING`. Returns NULL for
Returns the exterior ring (outer boundary) of a polygon as a `LINESTRING`. Returns NULL for

Comment thread docs/reference/sql/st_exteriorring.qmd Outdated
# under the License.

title: ST_ExteriorRing
description: Returns the exterior ring of a polygon as a linear ring geometry.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
description: Returns the exterior ring of a polygon as a linear ring geometry.
description: Returns the exterior ring of a polygon as a linestring geometry.

Comment on lines +787 to +789
@pytest.mark.parametrize("eng", [SedonaDB, PostGIS])
def test_st_buildarea_non_linework(eng):
"""POINT input forms no closed ring: PostGIS returns NULL, SedonaDB returns GEOMETRYCOLLECTION EMPTY."""

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should catch this case and return NULL as well

@ajaypadwal73

Copy link
Copy Markdown
Contributor Author

@paleolimbot Thanks - addressed the latest round:

ST_DelaunayTriangles

  • Switched the flags matcher from is_boolean to is_integer.
  • Added a Rust test covering the flags (MULTILINESTRING output) branch.
  • Docs: updated the description to mention the flags parameter and added a flags example.

ST_ExteriorRing

  • Reworked invoke_scalar to write GEOS output directly into the builder via mut impl Write, avoiding the intermediate WKB buffer copy.
  • Fixed the test assertion and docs (description + metadata) to use LINESTRING instead of LINEARRING.

ST_PointOnSurface

  • Dropped the geography matcher — it's a planar algorithm and geography is already implemented separately.

ST_BuildArea

  • Now catches the non-linework (POINT) case and returns NULL, matching PostGIS.

Let me know if anything else needs tweaking.

@paleolimbot paleolimbot left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few optional nits and there are a few place I think that it's easy enough to align our behaviour with PostGIS. Thank you!

Comment on lines +770 to +780
@pytest.mark.parametrize("eng", [SedonaDB, PostGIS])
@pytest.mark.parametrize(
("geom", "sedona_expected", "postgis_expected"),
[
# Both engines return an empty geometry for empty linework, not NULL.
# SedonaDB returns GEOMETRYCOLLECTION EMPTY; PostGIS returns POLYGON EMPTY.
("LINESTRING EMPTY", "GEOMETRYCOLLECTION EMPTY", "POLYGON EMPTY"),
("MULTILINESTRING EMPTY", "GEOMETRYCOLLECTION EMPTY", "POLYGON EMPTY"),
],
)
def test_st_buildarea_empty_linework(eng, geom, sedona_expected, postgis_expected):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we fix this so that we return a POLYGON EMPTY? This makes sense to me and should be possible.

Comment on lines +787 to +791
@pytest.mark.parametrize("eng", [SedonaDB, PostGIS])
@pytest.mark.parametrize(
("geom", "sedona_expected", "postgis_expected"),
[
("POINT (0 0)", None, None),

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we align our behaviour here? I think this is just checking geometry types.

Comment on lines +1524 to +1531
if tolerance is None:
eng.assert_query_result(
f"SELECT ST_DelaunayTriangles({geom_or_null(geom)}, NULL)", expected
)
else:
eng.assert_query_result(
f"SELECT ST_DelaunayTriangles({geom_or_null(geom)}, {tolerance})", expected
)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may have to import this from sedonadb.testing but we have a helper for this one:

Suggested change
if tolerance is None:
eng.assert_query_result(
f"SELECT ST_DelaunayTriangles({geom_or_null(geom)}, NULL)", expected
)
else:
eng.assert_query_result(
f"SELECT ST_DelaunayTriangles({geom_or_null(geom)}, {tolerance})", expected
)
eng.assert_query_result(
f"SELECT ST_DelaunayTriangles({geom_or_null(geom)}, {val_or_null(tolerance)})", expected
)

fn invoke_scalar(geom: &Geometry, writer: &mut impl std::io::Write) -> Result<()> {
let result = geom
.point_on_surface()
.map_err(|e| DataFusionError::Execution(format!("ST_PointOnSurface failed: {e}")))?;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.map_err(|e| DataFusionError::Execution(format!("ST_PointOnSurface failed: {e}")))?;
.map_err(|e| exec_datafusion_err!("ST_PointOnSurface failed: {e}"))?;

) -> Result<()> {
let result = geom
.delaunay_triangulation(tolerance, only_edges)
.map_err(|e| DataFusionError::Execution(format!("ST_DelaunayTriangles failed: {e}")))?;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.map_err(|e| DataFusionError::Execution(format!("ST_DelaunayTriangles failed: {e}")))?;
.map_err(|e| exec_datafusion_err!("ST_DelaunayTriangles failed: {e}"))?;

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants