feat(spatial): Haversine + PointToSegment geodesic wrappers (stream-consumer surface) #18

Open
estebanzimanyi wants to merge 1 commit into MobilityDB:main from estebanzimanyi:feat/spatial-haversine

Conversation

@estebanzimanyi
Member

Summary

Two utility classes in utils.spatial wrapping MEOS geog_distance for the per-event spatial-predicate call sites of stream-side consumers (MobilityFlink, MobilityKafka).

| Class | Signature | MEOS surface |
|---|---|---|
| Haversine.distance | (lon1, lat1, lon2, lat2) -> double | geog_distance(POINT, POINT) |
| PointToSegment.distance | (pLon, pLat, s1Lon, s1Lat, s2Lon, s2Lat) -> double | geog_distance(POINT, LINESTRING) |

Both wrappers accept primitive doubles (lon, lat) rather than higher-level geometry objects, so callers in tight per-event loops avoid wrapping their (lon, lat) fields into JTS Geometry / TGeomPoint before each call.

Why these two specifically

Per the stream-side parity analysis (MobilityFlink #3 + MobilityKafka #1, both at 27/27 BerlinMOD-Q cells), every spatial-predicate call site in their pipelines currently uses a pure-Java fallback:

  • pure-Java haversine formula for point-to-point distance
  • pure-Java planar equirectangular projection for point-to-segment distance

Both fallbacks drift semantically from the MEOS operators used elsewhere in the same pipeline (e.g., eintersects_tgeo_geo, edwithin_tgeo_geo, nad_tgeo_geo — all WGS84 spheroidal). These two wrappers let the streaming consumers mechanically swap each TODO(meos) site to a JMEOS call so the pipeline shares spatial semantics end-to-end.
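
For context, the two fallbacks being replaced might look roughly like the sketch below. This is an illustrative reconstruction, not the consumers' actual code: the class name FallbackDistances, the method names, and the mean-radius constant are assumptions. Both functions model the Earth as a sphere (and the second flattens it locally), which is one plausible source of the drift from the WGS84 spheroidal results.

```java
/** Illustrative stand-ins for the pure-Java fallbacks; names and constants are assumed. */
final class FallbackDistances {
    /** Mean Earth radius in metres (assumption; a real fallback may use another value). */
    static final double EARTH_RADIUS_M = 6_371_008.8;

    private FallbackDistances() {}

    /** Great-circle distance in metres on a sphere, via the haversine formula. */
    static double haversine(double lon1, double lat1, double lon2, double lat2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double sinDPhi = Math.sin((phi2 - phi1) / 2);
        double sinDLambda = Math.sin(Math.toRadians(lon2 - lon1) / 2);
        double a = sinDPhi * sinDPhi
                + Math.cos(phi1) * Math.cos(phi2) * sinDLambda * sinDLambda;
        // Clamp guards against rounding pushing sqrt(a) slightly above 1.
        return 2 * EARTH_RADIUS_M * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /** Point-to-segment distance in metres after a local equirectangular projection. */
    static double pointToSegmentPlanar(double pLon, double pLat,
                                       double s1Lon, double s1Lat,
                                       double s2Lon, double s2Lat) {
        // Project both endpoints into a plane centred on the query point,
        // scaling longitude differences by cos(latitude of the query point).
        double cosLat = Math.cos(Math.toRadians(pLat));
        double ax = Math.toRadians(s1Lon - pLon) * cosLat * EARTH_RADIUS_M;
        double ay = Math.toRadians(s1Lat - pLat) * EARTH_RADIUS_M;
        double bx = Math.toRadians(s2Lon - pLon) * cosLat * EARTH_RADIUS_M;
        double by = Math.toRadians(s2Lat - pLat) * EARTH_RADIUS_M;

        double dx = bx - ax;
        double dy = by - ay;
        double len2 = dx * dx + dy * dy;
        // Degenerate segment (s1 == s2): plain point-to-point distance.
        if (len2 == 0.0) {
            return Math.hypot(ax, ay);
        }
        // Parameter of the closest point on the line through s1 and s2,
        // clamped to [0, 1] so that it stays on the segment.
        double t = Math.max(0.0, Math.min(1.0, -(ax * dx + ay * dy) / len2));
        return Math.hypot(ax + t * dx, ay + t * dy);
    }
}
```

The clamp on t is what makes a point beyond an endpoint fall back to the distance to that endpoint, the same behaviour the PointToSegment tests check for the MEOS-backed wrapper.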

The nine raw FFI declarations the consumers need (edwithin_tgeo_geo, eintersects_tgeo_geo, edisjoint_tgeo_geo, nad_tgeo_tgeo, nad_tgeo_geo, nad_stbox_geo, nad_stbox_stbox, tdistance_tgeo_geo, tdistance_tgeo_tgeo) are already in functions.java on JMEOS main, following the multi-module merge of #9.

Tests

10 JUnit tests, all against MEOS-on-PostgreSQL ground truth (WGS84 spheroidal, use_spheroid=true):

HaversineTest (5):

  • zeroDistanceForIdenticalPoints: identical point pair → 0.0
  • shortMeridianSegment: 0.05° latitude → ~5 562 m
  • brusselsToParis: (4.35, 50.85) to (2.35, 48.86) → ~263 538 m
  • symmetric: d(a, b) == d(b, a)
  • nonNegative: SF to Tokyo great-circle, ~8 270 km, positive

PointToSegmentTest (5):

  • zeroAtEndpoint: point on a segment endpoint → 0
  • zeroAtInterior: point on the segment midpoint → 0
  • perpendicularDistance: off-axis point, ~3.5 km
  • beyondEndpointFallsBackToEndpoint: a point beyond an endpoint reduces to the Haversine distance to that endpoint
  • degenerateSegmentReducesToHaversine: s1 == s2 case
$ mvn -pl jmeos-core test -Dtest='HaversineTest,PointToSegmentTest'
Tests run: 10, Failures: 0, Errors: 0, Skipped: 0

Branch base

Branched off main (after the #9 multi-module merge). This PR is independent of the in-flight PRs that currently have merge conflicts (#8, #12, #15, #16, #17); those carry the codegen and multi-module structural work and don't gate this additive wrapper layer.

Downstream consumers

After this lands, the follow-up commits in MobilityFlink #3 and MobilityKafka #1 mechanically replace each TODO(meos) site with:

  • Haversine.distance(lon1, lat1, lon2, lat2) for point-to-point
  • PointToSegment.distance(pLon, pLat, s1Lon, s1Lat, s2Lon, s2Lat) for point-to-segment
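
One way such a swap could stay mechanical is to route each call site through a small functional interface, so the fallback and the JMEOS wrapper are interchangeable. The sketch below is only an illustration under that assumption: PointDistance, GpsEvent, and GeofenceFilter are hypothetical names that appear in neither consumer, and the snippet deliberately does not reference JMEOS so that it compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical shape matching the (lon1, lat1, lon2, lat2) -> double signature. */
@FunctionalInterface
interface PointDistance {
    double distance(double lon1, double lat1, double lon2, double lat2);
}

/** Hypothetical per-event payload carrying raw (lon, lat) fields. */
final class GpsEvent {
    final String vehicleId;
    final double lon;
    final double lat;

    GpsEvent(String vehicleId, double lon, double lat) {
        this.vehicleId = vehicleId;
        this.lon = lon;
        this.lat = lat;
    }
}

/** Keeps events within radiusM metres of a fixed centre, using a pluggable metric. */
final class GeofenceFilter {
    private final PointDistance metric;
    private final double centerLon;
    private final double centerLat;
    private final double radiusM;

    GeofenceFilter(PointDistance metric, double centerLon, double centerLat, double radiusM) {
        this.metric = metric;
        this.centerLon = centerLon;
        this.centerLat = centerLat;
        this.radiusM = radiusM;
    }

    /** Per-event predicate: primitives go straight to the metric, no geometry objects. */
    boolean accepts(GpsEvent e) {
        return metric.distance(e.lon, e.lat, centerLon, centerLat) <= radiusM;
    }

    /** Convenience for batch use; a stream operator would call accepts() per event. */
    List<GpsEvent> filter(List<GpsEvent> events) {
        List<GpsEvent> kept = new ArrayList<>();
        for (GpsEvent e : events) {
            if (accepts(e)) {
                kept.add(e);
            }
        }
        return kept;
    }
}
```

If Haversine.distance turns out to be a static method with the signature listed in this PR, passing Haversine::distance as the metric would be the whole swap; that detail is not confirmed here.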

Two utility classes in utils.spatial that wrap MEOS geog_distance for
the per-event spatial-predicate call sites of stream-side consumers
(MobilityFlink, MobilityKafka).  Both replace pure-Java fallbacks
(haversine formula / planar equirectangular projection) with the
canonical MEOS-on-PostgreSQL WGS84 spheroidal reference, so the
streaming pipeline shares spatial semantics with every other MEOS
operator instead of maintaining a parallel implementation that
semantically drifts.

  - Haversine.distance(lon1, lat1, lon2, lat2)
      → MEOS geog_distance on ephemeral SRID=4326 POINT geographies.
  - PointToSegment.distance(pLon, pLat, s1Lon, s1Lat, s2Lon, s2Lat)
      → MEOS geog_distance on ephemeral SRID=4326 POINT and 2-vertex
        LINESTRING geographies.
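
To make the "ephemeral SRID=4326 geographies" concrete, the sketch below builds the corresponding EWKT strings from primitive doubles. It is an assumption that the wrappers go through a text form at all; they may construct the geographies directly. The class and method names (EphemeralGeogWkt, point, segment) are hypothetical, and the range checks are an added safeguard, not documented behaviour.

```java
/** Hypothetical helper: EWKT text for the two geography shapes named above. */
final class EphemeralGeogWkt {
    private EphemeralGeogWkt() {}

    /** EWKT for a single WGS84 point, longitude first. */
    static String point(double lon, double lat) {
        checkLonLat(lon, lat);
        return "SRID=4326;POINT(" + lon + " " + lat + ")";
    }

    /** EWKT for a 2-vertex WGS84 linestring from s1 to s2. */
    static String segment(double s1Lon, double s1Lat, double s2Lon, double s2Lat) {
        checkLonLat(s1Lon, s1Lat);
        checkLonLat(s2Lon, s2Lat);
        return "SRID=4326;LINESTRING(" + s1Lon + " " + s1Lat + ","
                + s2Lon + " " + s2Lat + ")";
    }

    /** Rejects NaN and out-of-range coordinates before any native call is made. */
    private static void checkLonLat(double lon, double lat) {
        // The negated comparisons also catch NaN, which fails every comparison.
        if (!(lon >= -180.0 && lon <= 180.0)) {
            throw new IllegalArgumentException("longitude out of range: " + lon);
        }
        if (!(lat >= -90.0 && lat <= 90.0)) {
            throw new IllegalArgumentException("latitude out of range: " + lat);
        }
    }
}
```

Keeping longitude before latitude in every signature matches the (lon, lat) order used by both wrappers and by WKT itself, which avoids the common swapped-axis mistake.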

Both API contracts intentionally accept primitive doubles (lon, lat)
rather than higher-level geometry objects, so callers in tight per-event
loops avoid wrapping their (lon, lat) fields into JTS Geometry / TGeomPoint
before each call.

10 JUnit tests:
  - HaversineTest: 5 (zero, 5.5 km meridian, Brussels-Paris 264 km,
    symmetry, SF-Tokyo ~8270 km non-negativity)
  - PointToSegmentTest: 5 (endpoint zero, midpoint zero, perpendicular
    ~3.5 km, beyond-segment falls back to endpoint, degenerate segment
    reduces to point-to-point)

All against MEOS-on-PostgreSQL ground truth (WGS84 spheroidal,
use_spheroid=true).  All 10 pass against MEOS-1.4 on linux_amd64.