40 changes: 37 additions & 3 deletions Makefile
@@ -7,9 +7,43 @@ EXT_CONFIG=${PROJ_DIR}extension_config.cmake
# Include the Makefile from extension-ci-tools
include extension-ci-tools/makefiles/duckdb_extension.Makefile

# Single-timezone model (PGTZ-style): the extension's LoadInternal forces
# both MEOS (meos_initialize_timezone) and DuckDB (DBConfig::SetOptionByName
# "TimeZone") to Europe/Brussels. Tests pass on any OS timezone — the
# extension is the single source of truth, no TZ env var needed.
#
# LoadInternal also calls ExtensionHelper::AutoLoadExtension(db, "icu") so
# the timezone option is honoured. Autoload looks for the extension on disk
# at $HOME/.duckdb/extensions/<duckdb_version>/<platform>/icu.duckdb_extension
# and falls back to a hub download. That fails both inside the linux_amd64
# test docker container (empty path, no network egress) and on the macOS
# osx_arm64 test runner (hub icu not reliably resolvable). We copy the
# icu.duckdb_extension that was built locally as part of this extension's
# build (declared in extension_config.cmake) into the expected path,
# matched to the DuckDB platform string, before running the unittester.
DUCKDB_VERSION_TAG := v1.4.4

define stage_icu
@if [ -f ./build/$(1)/extension/icu/icu.duckdb_extension ]; then \
case "$$(uname -s)-$$(uname -m)" in \
Linux-x86_64) platform=linux_amd64 ;; \
Linux-aarch64) platform=linux_arm64 ;; \
Darwin-arm64) platform=osx_arm64 ;; \
Darwin-x86_64) platform=osx_amd64 ;; \
*) platform=$$(uname -m) ;; \
esac; \
target=$$HOME/.duckdb/extensions/$(DUCKDB_VERSION_TAG)/$$platform; \
mkdir -p "$$target" && cp -f ./build/$(1)/extension/icu/icu.duckdb_extension "$$target/" && \
echo "Staged icu.duckdb_extension at $$target/"; \
fi
endef

 test_release_internal:
-	TZ=UTC ./build/release/$(TEST_PATH) "$(PROJ_DIR)test/*"
+	$(call stage_icu,release)
+	./build/release/$(TEST_PATH) "$(PROJ_DIR)test/*"
 test_debug_internal:
-	TZ=UTC ./build/debug/$(TEST_PATH) "$(PROJ_DIR)test/*"
+	$(call stage_icu,debug)
+	./build/debug/$(TEST_PATH) "$(PROJ_DIR)test/*"
 test_reldebug_internal:
-	TZ=UTC ./build/reldebug/$(TEST_PATH) "$(PROJ_DIR)test/*"
+	$(call stage_icu,reldebug)
+	./build/reldebug/$(TEST_PATH) "$(PROJ_DIR)test/*"
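
The `case` statement in `stage_icu` maps `uname` output to a DuckDB platform string and then builds the staging directory from it. A minimal C++ sketch of that mapping, for readers who want to check it outside of make. The helper names `duckdb_platform` and `icu_stage_dir` are hypothetical and do not exist in the repository.

```cpp
#include <string>

// Hypothetical helper mirroring the `case` statement in stage_icu:
// maps `uname -s` / `uname -m` output to a DuckDB platform string.
std::string duckdb_platform(const std::string &sysname, const std::string &machine) {
    if (sysname == "Linux" && machine == "x86_64") return "linux_amd64";
    if (sysname == "Linux" && machine == "aarch64") return "linux_arm64";
    if (sysname == "Darwin" && machine == "arm64") return "osx_arm64";
    if (sysname == "Darwin" && machine == "x86_64") return "osx_amd64";
    return machine; // same fallback as the Makefile's `*)` branch
}

// Hypothetical helper: the directory stage_icu copies icu.duckdb_extension into.
std::string icu_stage_dir(const std::string &home, const std::string &version_tag,
                          const std::string &platform) {
    return home + "/.duckdb/extensions/" + version_tag + "/" + platform;
}
```

With these assumptions, a Linux x86_64 runner with `HOME=/root` would stage into `/root/.duckdb/extensions/v1.4.4/linux_amd64`. The fallback branch returns the bare machine name, which is unlikely to match a real DuckDB platform string, so on an unlisted host autoload would probably not find the staged file. `DUCKDB_VERSION_TAG` is pinned by hand and presumably has to track the DuckDB version the build actually links against.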
74 changes: 18 additions & 56 deletions docs/parity-status.md
@@ -1,6 +1,6 @@
# MobilityDuck parity status — surface-level audit

-Generated 2026-05-11. **Active addressable scope** (temporal + geo, excluding PG-only helpers): 929/943 names covered (98.5%).
+Generated 2026-05-11. **Active addressable scope** (temporal + geo, excluding PG-only helpers): 943/943 names covered (100.0%).

**Out of scope** (PG-only — no DuckDB equivalent exists): 315 names skipped — 84 from PG-only sections (GiST/SPGiST opclasses, set/span/spanset index files, `019_geo_constructors.in.sql` PG geometric types, `999_oid_cache.in.sql`) plus 231 PG helper functions inside active sections (`*_in/_out/_recv/_send`, `*_transfn/_combinefn/_finalfn/_serialize/_deserialize`, `*_sel/_joinsel/_supportfn/_analyze`, `*_typmod_in/_typmod_out`). Listed in appendix B; not counted in the headline.

@@ -20,18 +20,18 @@ Per-section counts: `Addressable` = MDB names minus PG-only helpers (see appendi

| Section | Addressable | Covered | Missing | Coverage | OOS | MDB operators |
|---|---:|---:|---:|---:|---:|---:|
-| `geo/050_geoset.in.sql` | 42 | 41 | 1 | 98% | 13 | 46 |
-| `geo/051_stbox.in.sql` | 73 | 70 | 3 | 96% | 10 | 29 |
+| `geo/050_geoset.in.sql` | 42 | 42 | 0 | 100% | 13 | 46 |
+| `geo/051_stbox.in.sql` | 73 | 73 | 0 | 100% | 10 | 29 |
| `geo/052_tgeo.in.sql` | 68 | 68 | 0 | 100% | 11 | 12 |
| `geo/052_tpoint.in.sql` | 69 | 69 | 0 | 100% | 9 | 12 |
| `geo/053_tgeo_inout.in.sql` | 18 | 18 | 0 | 100% | 0 | 0 |
| `geo/053_tpoint_inout.in.sql` | 18 | 18 | 0 | 100% | 0 | 0 |
| `geo/054_tgeo_compops.in.sql` | 6 | 6 | 0 | 100% | 1 | 36 |
| `geo/054_tpoint_compops.in.sql` | 6 | 6 | 0 | 100% | 0 | 36 |
-| `geo/056_tgeo_spatialfuncs.in.sql` | 16 | 15 | 1 | 94% | 0 | 0 |
-| `geo/056_tpoint_spatialfuncs.in.sql` | 28 | 27 | 1 | 96% | 1 | 0 |
-| `geo/058_tgeo_tile.in.sql` | 5 | 4 | 1 | 80% | 0 | 0 |
-| `geo/058_tpoint_tile.in.sql` | 11 | 10 | 1 | 91% | 0 | 0 |
+| `geo/056_tgeo_spatialfuncs.in.sql` | 16 | 16 | 0 | 100% | 0 | 0 |
+| `geo/056_tpoint_spatialfuncs.in.sql` | 28 | 28 | 0 | 100% | 1 | 0 |
+| `geo/058_tgeo_tile.in.sql` | 5 | 5 | 0 | 100% | 0 | 0 |
+| `geo/058_tpoint_tile.in.sql` | 11 | 11 | 0 | 100% | 0 | 0 |
| `geo/060_tgeo_boxops.in.sql` | 13 | 13 | 0 | 100% | 0 | 50 |
| `geo/060_tpoint_boxops.in.sql` | 13 | 13 | 0 | 100% | 0 | 50 |
| `geo/062_tgeo_posops.in.sql` | 16 | 16 | 0 | 100% | 0 | 76 |
@@ -46,7 +46,7 @@ Per-section counts: `Addressable` = MDB names minus PG-only helpers (see appendi
| `geo/072_tgeo_tempspatialrels.in.sql` | 6 | 6 | 0 | 100% | 0 | 0 |
| `geo/072_tpoint_tempspatialrels.in.sql` | 5 | 5 | 0 | 100% | 0 | 0 |
| `geo/076_tgeo_analytics.in.sql` | 12 | 12 | 0 | 100% | 0 | 0 |
-| `geo/076_tpoint_analytics.in.sql` | 18 | 17 | 1 | 94% | 0 | 0 |
+| `geo/076_tpoint_analytics.in.sql` | 18 | 18 | 0 | 100% | 0 | 0 |
| `geo/078_tpoint_datagen.in.sql` | 0 | 0 | 0 | 0% | 1 | 0 |
| `temporal/001_set.in.sql` | 47 | 47 | 0 | 100% | 35 | 38 |
| `temporal/002_set_ops.in.sql` | 11 | 11 | 0 | 100% | 0 | 176 |
@@ -58,7 +58,7 @@ Per-section counts: `Addressable` = MDB names minus PG-only helpers (see appendi
| `temporal/021_tbox.in.sql` | 52 | 52 | 0 | 100% | 8 | 21 |
| `temporal/022_temporal.in.sql` | 101 | 101 | 0 | 100% | 16 | 24 |
| `temporal/023_temporal_inout.in.sql` | 16 | 16 | 0 | 100% | 0 | 0 |
-| `temporal/025_temporal_tile.in.sql` | 16 | 11 | 5 | 69% | 0 | 0 |
+| `temporal/025_temporal_tile.in.sql` | 16 | 16 | 0 | 100% | 0 | 0 |
| `temporal/026_tnumber_mathfuncs.in.sql` | 17 | 17 | 0 | 100% | 0 | 24 |
| `temporal/028_tbool_boolops.in.sql` | 4 | 4 | 0 | 100% | 0 | 7 |
| `temporal/029_ttext_textfuncs.in.sql` | 4 | 4 | 0 | 100% | 0 | 3 |
@@ -70,48 +70,10 @@ Per-section counts: `Addressable` = MDB names minus PG-only helpers (see appendi
| `temporal/040_temporal_aggfuncs.in.sql` | 0 | 0 | 0 | 0% | 40 | 0 |
| `temporal/042_temporal_waggfuncs.in.sql` | 0 | 0 | 0 | 0% | 8 | 0 |
| `temporal/046_temporal_analytics.in.sql` | 4 | 4 | 0 | 100% | 0 | 0 |
-| **TOTAL (active)** | **943** | **929** | **14** | **99%** | **231** | — |
+| **TOTAL (active)** | **943** | **943** | **0** | **100%** | **231** | — |

## Missing function names per active section

-### `geo/050_geoset.in.sql` — 1 missing of 42 addressable (98% covered)
-
-- `transformPipeline` (2 overloads)
-
-### `geo/051_stbox.in.sql` — 3 missing of 73 addressable (96% covered)
-
-- `geography`
-- `perimeter`
-- `quadSplit`
-
-### `geo/056_tgeo_spatialfuncs.in.sql` — 1 missing of 16 addressable (94% covered)
-
-- `transformPipeline` (2 overloads)
-
-### `geo/056_tpoint_spatialfuncs.in.sql` — 1 missing of 28 addressable (96% covered)
-
-- `transformPipeline` (3 overloads)
-
-### `geo/058_tgeo_tile.in.sql` — 1 missing of 5 addressable (80% covered)
-
-- `timeBoxes`
-
-### `geo/058_tpoint_tile.in.sql` — 1 missing of 11 addressable (91% covered)
-
-- `timeBoxes`
-
-### `geo/076_tpoint_analytics.in.sql` — 1 missing of 18 addressable (94% covered)
-
-- `geography` (2 overloads)
-
-### `temporal/025_temporal_tile.in.sql` — 5 missing of 16 addressable (69% covered)
-
-- `timeBins` (4 overloads)
-- `timeBoxes` (2 overloads)
-- `valueBins` (2 overloads)
-- `valueBoxes` (2 overloads)
-- `valueTimeBoxes` (2 overloads)
-
## Appendix B — Out of scope (PG-only, no DuckDB equivalent)

These entries are PG-specific helpers — index opclasses, aggregate transition/combine/final/serialize callbacks, planner hooks (`_sel`, `_joinsel`, `_supportfn`, `_analyze`), text/binary I/O helpers (`_in`, `_out`, `_recv`, `_send`), type modifier helpers, the `999_oid_cache` PG catalog hook, and PG geometric type constructors (`019_geo_constructors`). None of them have DuckDB equivalents and they should not be implemented; listed here only for completeness.
@@ -162,11 +124,11 @@ These families (cbuffer, npoint, pose, rgeo) are deferred until the active tempo

| Section | Addressable | Covered | Missing | Coverage |
|---|---:|---:|---:|---:|
-| `cbuffer/150_cbuffer.in.sql` | 31 | 7 | 24 | 23% |
-| `cbuffer/151_cbufferset.in.sql` | 42 | 32 | 10 | 76% |
+| `cbuffer/150_cbuffer.in.sql` | 31 | 8 | 23 | 26% |
+| `cbuffer/151_cbufferset.in.sql` | 42 | 33 | 9 | 79% |
| `cbuffer/152_tcbuffer.in.sql` | 84 | 66 | 18 | 79% |
| `cbuffer/154_tcbuffer_compops.in.sql` | 6 | 6 | 0 | 100% |
-| `cbuffer/155_tcbuffer_spatialfuncs.in.sql` | 9 | 6 | 3 | 67% |
+| `cbuffer/155_tcbuffer_spatialfuncs.in.sql` | 9 | 7 | 2 | 78% |
| `cbuffer/158_tcbuffer_topops.in.sql` | 7 | 7 | 0 | 100% |
| `cbuffer/159_tcbuffer_posops.in.sql` | 12 | 12 | 0 | 100% |
| `cbuffer/160_tcbuffer_distance.in.sql` | 5 | 4 | 1 | 80% |
@@ -186,24 +148,24 @@ These families (cbuffer, npoint, pose, rgeo) are deferred until the active tempo
| `npoint/093_tnpoint_distance.in.sql` | 4 | 4 | 0 | 100% |
| `npoint/095_tnpoint_aggfuncs.in.sql` | 8 | 0 | 8 | 0% |
| `npoint/098_tnpoint_indexes.in.sql` | 1 | 0 | 1 | 0% |
-| `pose/100_pose.in.sql` | 34 | 10 | 24 | 29% |
-| `pose/101_poseset.in.sql` | 46 | 33 | 13 | 72% |
+| `pose/100_pose.in.sql` | 34 | 11 | 23 | 32% |
+| `pose/101_poseset.in.sql` | 46 | 34 | 12 | 74% |
| `pose/102_tpose.in.sql` | 84 | 65 | 19 | 77% |
| `pose/104_tpose_compops.in.sql` | 6 | 6 | 0 | 100% |
-| `pose/105_tpose_spatialfuncs.in.sql` | 8 | 7 | 1 | 88% |
+| `pose/105_tpose_spatialfuncs.in.sql` | 8 | 8 | 0 | 100% |
| `pose/108_tpose_topops.in.sql` | 7 | 7 | 0 | 100% |
| `pose/109_tpose_posops.in.sql` | 16 | 16 | 0 | 100% |
| `pose/111_tpose_aggfuncs.in.sql` | 7 | 0 | 7 | 0% |
| `pose/113_tpose_distance.in.sql` | 4 | 4 | 0 | 100% |
| `pose/114_tpose_indexes.in.sql` | 1 | 0 | 1 | 0% |
| `rgeo/122_trgeo.in.sql` | 83 | 65 | 18 | 78% |
| `rgeo/124_trgeo_compops.in.sql` | 6 | 6 | 0 | 100% |
-| `rgeo/125_trgeo_spatialfuncs.in.sql` | 4 | 3 | 1 | 75% |
+| `rgeo/125_trgeo_spatialfuncs.in.sql` | 4 | 4 | 0 | 100% |
| `rgeo/128_trgeo_topops.in.sql` | 5 | 5 | 0 | 100% |
| `rgeo/129_trgeo_posops.in.sql` | 12 | 12 | 0 | 100% |
| `rgeo/131_trgeo_aggfuncs.in.sql` | 7 | 0 | 7 | 0% |
| `rgeo/133_trgeo_distance.in.sql` | 4 | 4 | 0 | 100% |
| `rgeo/133_trgeo_vclip.in.sql` | 6 | 0 | 6 | 0% |
| `rgeo/134_trgeo_indexes.in.sql` | 1 | 0 | 1 | 0% |
-| **TOTAL (deferred)** | **782** | **542** | **240** | **69%** |
+| **TOTAL (deferred)** | **782** | **549** | **233** | **70%** |
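
The headline figure and the `Coverage` column both look like covered/addressable, with the headline printed to one decimal place and the table rows rounded to whole percent. A small sketch of that arithmetic, assuming round-half-away-from-zero and that empty sections print 0%. The name `coverage_percent` is hypothetical; the script that generates this report is not part of the diff.

```cpp
#include <cmath>

// Coverage as a percentage of addressable names, rounded to `decimals` places.
// Assumption: sections with zero addressable names report 0, as the tables do.
double coverage_percent(int covered, int addressable, int decimals = 0) {
    if (addressable == 0) return 0.0;
    const double scale = std::pow(10.0, decimals);
    return std::round(100.0 * covered / addressable * scale) / scale;
}
```

Under these assumptions 929/943 gives 98.5 at one decimal and 99 at zero decimals, which matches both the old headline and the old TOTAL row, and 549/782 gives 70 for the new deferred total.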

59 changes: 57 additions & 2 deletions src/geo/geoset.cpp
@@ -112,11 +112,28 @@ void SpatialSetType::RegisterScalarFunctions(ExtensionLoader &loader) {
{SpatialSetType::geomset(), LogicalType::INTEGER}, SpatialSetType::geomset(), SpatialSetFunctions::Spatialset_transform));

duckdb::RegisterSerializedScalarFunction(loader, ScalarFunction(
-"transform",
+"transform",
{SpatialSetType::geogset(), LogicalType::INTEGER}, SpatialSetType::geogset(), SpatialSetFunctions::Spatialset_transform));

// transformPipeline(<geomset|geogset>, pipeline text, srid int = 0,
// is_forward bool = true)
for (auto &set_type : {SpatialSetType::geomset(), SpatialSetType::geogset()}) {
duckdb::RegisterSerializedScalarFunction(loader, ScalarFunction(
"transformPipeline",
{set_type, LogicalType::VARCHAR},
set_type, SpatialSetFunctions::Spatialset_transform_pipeline));
duckdb::RegisterSerializedScalarFunction(loader, ScalarFunction(
"transformPipeline",
{set_type, LogicalType::VARCHAR, LogicalType::INTEGER},
set_type, SpatialSetFunctions::Spatialset_transform_pipeline));
duckdb::RegisterSerializedScalarFunction(loader, ScalarFunction(
"transformPipeline",
{set_type, LogicalType::VARCHAR, LogicalType::INTEGER, LogicalType::BOOLEAN},
set_type, SpatialSetFunctions::Spatialset_transform_pipeline));
}
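
The loop above registers one overload per prefix of the optional trailing arguments (`srid`, then `is_forward`), since each registration here is a fixed signature. A standalone sketch of that enumeration, using plain strings in place of `LogicalType`. The `overloads` helper and the `Signature` alias are hypothetical and only illustrate the pattern.

```cpp
#include <string>
#include <vector>

using Signature = std::vector<std::string>;

// Returns the required arguments alone, then one signature per additional
// optional argument, in declaration order.
std::vector<Signature> overloads(const Signature &required, const Signature &optional_tail) {
    std::vector<Signature> out;
    Signature current = required;
    out.push_back(current);
    for (const auto &arg : optional_tail) {
        current.push_back(arg);
        out.push_back(current);
    }
    return out;
}
```

For `transformPipeline`, `overloads({"geomset", "VARCHAR"}, {"INTEGER", "BOOLEAN"})` yields the same three argument lists the loop body spells out by hand. A caller cannot skip `srid` and pass only `is_forward` with this scheme.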

duckdb::RegisterSerializedScalarFunction(loader, ScalarFunction(
-"startValue", {SpatialSetType::geomset()},
+"startValue", {SpatialSetType::geomset()},
GeoTypes::GEOMETRY(),
SpatialSetFunctions::Set_start_value
));
@@ -451,6 +468,44 @@ void SpatialSetFunctions::Spatialset_transform(DataChunk &args, ExpressionState
}
}

/* transformPipeline(<spatial-set>, pipeline text, srid int = 0,
* is_forward bool = true)
* Apply a PROJ pipeline string to every element of the spatial set.
*/
void SpatialSetFunctions::Spatialset_transform_pipeline(DataChunk &args, ExpressionState &state, Vector &result_vec) {
const idx_t row_count = args.size();
for (idx_t i = 0; i < args.ColumnCount(); i++) args.data[i].Flatten(row_count);
const idx_t cc = args.ColumnCount();
auto in_set = FlatVector::GetData<string_t>(args.data[0]);
auto in_pipe = FlatVector::GetData<string_t>(args.data[1]);
auto &v0 = FlatVector::Validity(args.data[0]);
auto &v1 = FlatVector::Validity(args.data[1]);
auto out_data = FlatVector::GetData<string_t>(result_vec);
auto &out_validity = FlatVector::Validity(result_vec);
for (idx_t row = 0; row < row_count; row++) {
if (!v0.RowIsValid(row) || !v1.RowIsValid(row)) {
out_validity.SetInvalid(row);
continue;
}
size_t sz = in_set[row].GetSize();
Set *s = (Set *) malloc(sz);
memcpy(s, in_set[row].GetData(), sz);
int32_t srid = (cc > 2) ? FlatVector::GetData<int32_t>(args.data[2])[row] : 0;
bool is_fwd = (cc > 3) ? FlatVector::GetData<bool>(args.data[3])[row] : true;
std::string pipe = in_pipe[row].GetString();
Set *ret = spatialset_transform_pipeline(s, pipe.c_str(), srid, is_fwd);
free(s);
if (!ret) {
out_validity.SetInvalid(row);
continue;
}
size_t rsz = set_mem_size(ret);
out_data[row] = StringVector::AddStringOrBlob(result_vec, (const char *) ret, rsz);
free(ret);
}
if (row_count == 1) result_vec.SetVectorType(VectorType::CONSTANT_VECTOR);
}
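
`Spatialset_transform_pipeline` serves all three overloads with one body: it reads `srid` and `is_forward` from the argument columns only when the column count says they were supplied, and otherwise uses the SQL defaults (`0` and `true`). A standalone sketch of that per-row resolution, with raw arrays standing in for flattened DuckDB vectors. `PipelineOptions` and `resolve_options` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

struct PipelineOptions {
    int32_t srid;
    bool is_forward;
};

// column_count counts every SQL argument, including the set (index 0) and
// the pipeline text (index 1). srid_col / fwd_col are only dereferenced
// when the chosen overload actually supplies that column.
PipelineOptions resolve_options(std::size_t column_count, const int32_t *srid_col,
                                const bool *fwd_col, std::size_t row) {
    PipelineOptions opts{0, true}; // defaults from the documented signature
    if (column_count > 2) opts.srid = srid_col[row];
    if (column_count > 3) opts.is_forward = fwd_col[row];
    return opts;
}
```

The function in the diff checks validity only for the first two columns. Whether a NULL `srid` or `is_forward` needs its own check depends on the null handling the registration uses, which this diff does not show.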

// --- startValue ---
void SpatialSetFunctions::Set_start_value(DataChunk &args, ExpressionState &state, Vector &result) {
auto &input = args.data[0];
37 changes: 37 additions & 0 deletions src/geo/stbox.cpp
@@ -393,6 +393,43 @@ void StboxType::RegisterScalarFunctions(ExtensionLoader &loader) {
ScalarFunction("SRID", {STBOX()}, LogicalType::INTEGER,
StboxFunctions::Stbox_srid));

// perimeter(stbox [, spheroid bool]) — sum of edge lengths.
duckdb::RegisterSerializedScalarFunction(loader,
ScalarFunction("perimeter", {STBOX()}, LogicalType::DOUBLE,
StboxFunctions::Stbox_perimeter));
duckdb::RegisterSerializedScalarFunction(loader,
ScalarFunction("perimeter", {STBOX(), LogicalType::BOOLEAN},
LogicalType::DOUBLE, StboxFunctions::Stbox_perimeter));

// quadSplit(stbox) — split the spatial extent into four quadrants
// (each with the original time span), returning an stbox[].
duckdb::RegisterSerializedScalarFunction(loader,
ScalarFunction("quadSplit", {STBOX()},
LogicalType::LIST(STBOX()),
StboxFunctions::Stbox_quad_split));

// geography(stbox) — same C entrypoint as `geometry(stbox)`; DuckDB
// has no separate geography type so both routes produce a GEOMETRY
// blob. Registered for naming parity with MobilityDB.
duckdb::RegisterSerializedScalarFunction(loader,
ScalarFunction("geography", {STBOX()}, GeoTypes::GEOMETRY(),
StboxFunctions::Stbox_to_geo));

// transformPipeline(stbox, pipeline text, srid int = 0,
// is_forward bool = true)
duckdb::RegisterSerializedScalarFunction(loader,
ScalarFunction("transformPipeline",
{STBOX(), LogicalType::VARCHAR},
STBOX(), StboxFunctions::Stbox_transform_pipeline));
duckdb::RegisterSerializedScalarFunction(loader,
ScalarFunction("transformPipeline",
{STBOX(), LogicalType::VARCHAR, LogicalType::INTEGER},
STBOX(), StboxFunctions::Stbox_transform_pipeline));
duckdb::RegisterSerializedScalarFunction(loader,
ScalarFunction("transformPipeline",
{STBOX(), LogicalType::VARCHAR, LogicalType::INTEGER, LogicalType::BOOLEAN},
STBOX(), StboxFunctions::Stbox_transform_pipeline));
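
The comments above describe `perimeter` as the sum of edge lengths and `quadSplit` as a split of the spatial extent into four quadrants. A planar sketch of both on a hypothetical `Box2D` struct. It ignores the Z dimension, the time span, and geodetic (`spheroid`) boxes, and the quadrant order is an assumption, so it is not a description of what MEOS returns.

```cpp
#include <array>

// Hypothetical stand-in for the X/Y extent of an stbox.
struct Box2D {
    double xmin, ymin, xmax, ymax;
};

// Planar perimeter: sum of the four edge lengths.
double perimeter(const Box2D &b) {
    return 2.0 * ((b.xmax - b.xmin) + (b.ymax - b.ymin));
}

// Split at the midpoint of each axis into four equal quadrants.
std::array<Box2D, 4> quad_split(const Box2D &b) {
    const double xmid = (b.xmin + b.xmax) / 2.0;
    const double ymid = (b.ymin + b.ymax) / 2.0;
    return {{
        {b.xmin, b.ymin, xmid, ymid},   // lower-left
        {xmid, b.ymin, b.xmax, ymid},   // lower-right
        {b.xmin, ymid, xmid, b.ymax},   // upper-left
        {xmid, ymid, b.xmax, b.ymax},   // upper-right
    }};
}
```

In this planar model each quadrant has half the perimeter of the original box, which gives a quick sanity check when comparing against real `quadSplit` output.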

duckdb::RegisterSerializedScalarFunction(loader,
ScalarFunction(
"shiftTime",