Skip to content

Initialize MEOS per worker thread#137

Open
estebanzimanyi wants to merge 3 commits into
fix/amd64-icu-test-autoloadfrom
fix/meos-per-thread-init
Open

Initialize MEOS per worker thread#137
estebanzimanyi wants to merge 3 commits into
fix/amd64-icu-test-autoloadfrom
fix/meos-per-thread-init

Conversation

@estebanzimanyi
Copy link
Copy Markdown
Member

The MEOS 1.4 uplift keeps the session timezone, errno, PROJ context and RNGs in thread-local storage and requires every thread that calls into MEOS to run meos_initialize() before its first call. The extension only initialized MEOS once on the load thread via std::call_once, so DuckDB TaskScheduler worker threads executed scalar, cast and aggregate bodies with a NULL session_timezone and segfaulted in pg_next_dst_boundary on the first timestamp parse (reproduced locally as a SIGSEGV in 042_tgeogpoint_parity, previously masked because amd64 CI never reached the tests and arm64 runs no test step). A thread-local guard now runs the per-thread init and re-installs the process-global MobilityDuck error handler (meos_initialize() resets it to the exit-on-error default), invoked from the scalar exec wrapper and through a cast registration trampoline that covers every cast entry point. Locally the full suite goes from unrunnable to 58/59 files and 1311/1312 assertions with no segfaults; the one remaining assertion is an unrelated pre-existing ln() output drift from the MEOS bump.

MEOS keeps the session timezone, errno, PROJ context and RNGs in
thread-local storage and requires every thread that calls into it to
run meos_initialize() before its first call. The extension only did
this once on the load thread, so DuckDB TaskScheduler workers ran
scalar, cast and aggregate bodies with a NULL session_timezone and
segfaulted in pg_next_dst_boundary on the first timestamp parse. A
thread-local guard now runs the per-thread init (and re-installs the
process-global error handler, which meos_initialize() resets to the
exit-on-error default) at the scalar exec wrapper and through a cast
registration trampoline covering every cast entry point.
@estebanzimanyi estebanzimanyi changed the base branch from fix/bump-meos-pin to fix/amd64-icu-test-autoload May 16, 2026 06:15
@estebanzimanyi estebanzimanyi force-pushed the fix/meos-per-thread-init branch from 54e6181 to f896b95 Compare May 16, 2026 06:15
@estebanzimanyi
Copy link
Copy Markdown
Member Author

Reviewer's quickstart — ~8 minutes

What this PR does in one sentence: moves meos_initialize() from process-once to per-DuckDB-worker-thread, fixing the data-race where MEOS's thread-local state (timezone, error handler, lwgeom session) was uninitialised in workers spawned after the initial LoadInternal.

Files (14): mostly src/geo/*.cpp and src/temporal/*.cpp — each scalar/cast/cast-trampoline now starts with MeosThread::ensureReady() (a TLS-guarded one-time init).

Read first:

  • src/include/mobilityduck/meos_thread.hpp (new) — the TLS guard + ensureReady() definition.
  • src/include/mobilityduck/meos_exec_serial.hpp (new) — the serial-execution wrapper for paths MEOS itself isn't thread-safe on (pg-derived TZ, legacy GEOS).
  • src/mobilityduck_extension.cpp — registers the worker thread's MEOS state.
  • 11 callsite files just add MeosThread::ensureReady(); at the top of each UDF lambda body.

Verification:

# Repro pre-PR: SIGSEGV on multi-threaded queries
duckdb -unsigned -c "LOAD mobilityduck; SET threads=4; SELECT count(*) FROM (SELECT atTime(traj::VARCHAR, '2026-01-01'::TIMESTAMP) FROM ...);"
# Pre-PR: occasionally crashes. Post-PR: clean.

Cross-link: Linux arm64 needs #161 for the MeosType build error.

Why it's safe to merge: pure additive guards; valid-input paths are unaffected. The change correctly diagnoses MEOS's process-singleton initialization model as a multi-worker bug.

estebanzimanyi and others added 2 commits May 21, 2026 17:36
`meosType` (lower-case) is the **pre-consolidation** MEOS type name;
`MeosType` (upper-case) is the **post-consolidation** target that the
upstream rename sweep has not yet reached.  The current vcpkg pin
(`vcpkg_ports/meos/portfile.cmake` REF f11b7443ee98…) is still
pre-consolidation: `meos/include/temporal/meos_catalog.h` line 121
declares the typedef as `} meosType;` and every MEOS API uses the
lower-case spelling.  MobilityDuck's source code consistently uses
`meosType` to match — `grep -rn '\bMeosType\b' src/` finds the name
only on the alias line and its comment, nowhere else.

c8cad6d added `using meosType = MeosType;` as a forward-looking
bridge for the eventual consolidation bump.  That bridge points at
`MeosType`, which the current pin does NOT yet expose, so it
breaks every PR's Linux arm64 build with:

  /duckdb_build_dir/src/include/tydef.hpp:18:18:
    error: ‘MeosType’ does not name a type; did you mean ‘meosType’?

The fix is to drop the premature alias and replace the misleading
comment with one that documents the pre/post-consolidation distinction
and the resume path for the next pin bump — at that point a reviewer
can either restore the bridge (this time it'll be valid because
`MeosType` will exist) or sweep the MobilityDuck source from
`meosType` to `MeosType` in a single PR.

Unblocks every in-flight PR's Linux arm64 build: #126, #130, #149,
#158, #159, #160, plus the entire `feat/*_port_core` extended-type
stack (#148/#150/#151/#153/#155/#156).
The stage_icu helper mapped only the Linux uname values, so on the
macOS arm64 test runner uname -m returned "arm64" and the icu
extension was copied to .duckdb/extensions/v1.4.4/arm64 instead of
.../osx_arm64, where DuckDB's autoload looks. The hub fallback is not
reliably resolvable on that runner, so the osx_arm64 Test step failed
to load the extension. Map the OS and architecture to the DuckDB
platform string (linux_amd64, linux_arm64, osx_amd64, osx_arm64) so
the locally built icu is staged at the path autoload expects on every
tested platform; the Linux mapping is unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant