Skip to content

feat: support java bindings & cross-platform fat jar compilation#711

Closed
PongPong wants to merge 12 commits into
openvenues:masterfrom
PongPong:feat/jni
Closed

feat: support java bindings & cross-platform fat jar compilation#711
PongPong wants to merge 12 commits into
openvenues:masterfrom
PongPong:feat/jni

Conversation

@PongPong
Copy link
Copy Markdown

@PongPong PongPong commented Nov 26, 2025

📋 What's New

1. Zig Build System (build.zig) - New File

 **Why:** Enable easy cross-compilation and modern build tooling alongside existing autotools

Features:

  • Native builds for development (using host system)
  • Cross-compilation step (zig build cross) for all 6 platforms
  • Builds both main library (libpostal) and JNI wrapper (libpostal_jni)
  • Optimized for size (-Os) with dead code elimination
  • Platform-specific configurations for Linux/macOS/Windows

Architecture:

  • Helper functions for code reuse: buildLibrary(),
    buildJNILibrary(), configureLibrary()
  • Flexible C flags generation based on target and optimization level
  • Clean separation of concerns with well-documented functions

Source List:
Includes 43 source files from the existing autotools build:

  • Core library: libpostal, expand, address_dictionary, etc.

  • Parsers: address_parser, language_classifier, etc.

  • Utilities: string_utils, file_utils, utf8proc, etc.

  • Scanner: klib/drand48.c, scanner.c

    2. Header Name Conflict Resolution

    Problem Discovered: Cross-compiling revealed that src/features.h conflicts with system glibc <features.h>

    Root Cause: System headers like stdlib.h include <features.h>, but with -I src in the include path (needed for JNI), the compiler found our local file instead, breaking critical system macros (__GLIBC_USE, __GNUC_PREREQ)

    Solution:

  • Renamed: src/features.csrc/libpostal_features.c

  • Renamed: src/features.hsrc/libpostal_features.h

  • Updated all includes in 11 source files (.c and .h)

  • Updated src/Makefile.am (6 source list references)

  • Updated test/test_string_utils.c

    Impact: This benefits both Zig and autotools builds - the conflict could occur in any build system that includes src/ in the include path

    3. Windows Cross-Compilation Support

    Challenge: Windows (MinGW) doesn't have POSIX drand48() but needs the custom implementation

    Solution:

  • For Windows targets: Skip config.h (which incorrectly defines HAVE_DRAND48=1 from host system)

  • Manually define HAVE_DIRENT_H (MinGW provides this)

  • Let existing src/klib/drand48.c provide the implementation when HAVE_DRAND48 is not defined

    **Result:** Windows builds work correctly with custom `drand48()`
    

    implementation

    #### 4. Additional Improvements
    
    **`src/vector.h`:**
    
  • Fixed -Wunused-parameter warning for Windows builds

  • Added (void)old_size; to explicitly mark unused parameter

    .gitignore:

  • Added Zig build artifacts: build/, /build (compiled build.zig binary)

  • Added editor backup files: *~

  • Added test artifacts: test_arraylist, test_arraylist2, configure~

  • Organized with clear section comments

    🏗️ Build Capabilitie

    Using Zig Build System:

    # Native build (for development)
    zig build
    
    # Cross-compile for all platforms
    zig build cross
    
    # Output directory structure
    zig-out/
    lib/
        linux-x86_64/       # libpostal.so + libpostal_jni.so
        linux-aarch64/      # libpostal.so + libpostal_jni.so
        macos-x86_64/       # libpostal.dylib + libpostal_jni.dylib
        macos-aarch64/      # libpostal.dylib + libpostal_jni.dylib
        windows-x86_64/     # postal.dll + postal_jni.dll
        windows-aarch64/    # postal.dll + postal_jni.dll

All platforms built successfully:

  • ✅ Linux x86_64 (6.9M)
  • ✅ Linux aarch64 (6.6M)
  • ✅ macOS x86_64 (6.8M)
  • ✅ macOS aarch64 (6.7M)
  • ✅ Windows x86_64 (6.6M)
  • ✅ Windows aarch64 (6.4M)
  • ✅ JNI wrappers for all platforms

Build time: ~7 minutes for all 6 platforms + native

🔄 Compatibility

Existing Build System:

  • ✅ Autotools (./configure && make) still works as before
  • ✅ All existing build scripts and workflows unchanged
  • ✅ The header rename (features.h → libpostal_features.h) improves compatibility by avoiding system header conflicts

Requirements:

  • Zig 0.15+ for using the new build system
  • Existing dependencies unchanged (autotools builds work as before)

🧪 Testing

Zig Build:

  • ✅ Syntax check: zig ast-check build.zig
  • ✅ Native build: zig build
  • ✅ Cross-compilation: zig build cross (31/31 steps succeed)

MinGW Build:

  • ✅ Compiles without errors after header rename
  • ⚠️ Some warnings in string_utils.c (pre-existing code quality issues with unused variables)

Autotools Build:

  • ✅ Compatible with header rename
  • ✅ All existing workflows continue to work

💡 Benefits

  • Easy Cross-Compilation: One command builds for 6 platforms
  • Modern Tooling: Zig provides better cross-compilation support than traditional toolchains
  • No Dependencies: Zig bundles cross-compilation toolchains (no need for separate gcc/clang builds)
  • Fast Incremental Builds: Zig's caching is efficient
  • Reproducible Builds: Same source produces identical binaries
  • Dual Build System: Choose between autotools (production) or Zig (development/cross-compilation)

📝 Notes

  • The Zig build system is additive - it doesn't replace or modify the existing autotools setup
  • Generated artifacts are compatible with existing distribution methods
  • The header rename (features.h → libpostal_features.h) is a bug fix that benefits all build methods
  • Build output sizes are consistent across platforms (~6-7MB per library)

🎯 Use Cases

  • Developers: Quick native builds with zig build
  • Release Managers: One-command multi-platform builds
  • CI/CD: Fast, reproducible cross-platform builds without Docker/VMs
  • JNI Users: Pre-built native libraries for all platforms

@PongPong PongPong changed the title feat: support java bindings feat: support java bindings & cross-platform fat jar compilation Nov 26, 2025
Chun Pong Lam added 3 commits November 26, 2025 16:19
Updated test/Makefile.am to reference the renamed libpostal_features.c
instead of features.c in test_libpostal_SOURCES
@albarrentine
Copy link
Copy Markdown
Contributor

this seems like an A.I. pull request (fine for small changes, tests, and bugfixes but not major refactors, etc.). This is the C repo and should not contain any binding-specific code or build steps. The Java repo is at: https://github.com/openvenues/jpostal so anything related to JNI bindings should be raised there. Adding a new build system in a different language when autotools is more standard and works fine seems weird to me although in a future version I'm experimenting with doing most/all of the config/cross-platform stuff with the preprocessor

@PongPong
Copy link
Copy Markdown
Author

@albarrentine, may I submit an MR to resolve the src/features.[c|h] conflicts and the missing drand48() issues on Windows?
I think using Zig to build a cross-platform fat JAR is still a good option — it reduces dependencies and works reliably across platforms.

I wasn’t aware of the jpostal repository before, but I’m planning to submit an MR to pull in the C source code, build it with Zig, and publish it to the Maven repository.

What do you think?

@albarrentine
Copy link
Copy Markdown
Contributor

There's some discussion on Maven packaging in jpostal and it looks like there's a working implementation though we're just waiting on a pull request (looking over it briefly I wasn't sure why the compiled shared objects needed to be checked in to the fork but not a Java/Maven user): openvenues/jpostal#7 (comment)

jpostal currently uses Gradle for the build and I think it's already working on Windows. Introducing Zig just as a build system in a non-Zig project seems like a solution looking for a problem IMHO. It's adding several hundred lines of code that need to be maintained alongside autotools and it's not clear what the performance implications are i.e. if Zig's C compiler (which is not even 1.0 yet) will produce the level of optimizations that battle-tested compilers like gcc or clang do and am not going to make that choice for everyone by default. There's a listing of C projects built with Zig at https://github.com/allyourcodebase, and there's not currently a Zig interface to libpostal, so you're welcome to make one and we'll add to the README. For libpostal and other C projects the change I'm looking to introduce to the build system is removing steps from the build so it can be run with just make.

drand48 for Windows issue is addressed by the existing autotools build. Since Windows support was initially added, the C lib now includes an implementation of drand48 and checks for it during the configure step. For Windows need to make sure to use these build instructions: https://github.com/openvenues/libpostal?tab=readme-ov-file#installation-windows. In a future version will likely rip out the klib shuffle implementation and use the xorshift implementation for the random number generator (https://github.com/goodcleanfun/random - this is tested on Linux/Mac/Windows even MSVC).

Haven't seen features.{hc} come up as a bug. <features.h> is a GNU libc thing, so I believe that would only be a conflict if we were trying to include <features.h> and expecting to use functions from there while our own "features.h" was in the same directory. Made that mistake the other day by naming a file "float.h" 🤦 and then needing something from <float.h>, but not seeing where that can happen here.

The other issue that needs to be resolved on the Java/jpostal side if interested is to consolidate the two PRs re: Java's "unique" take on UTF-8. There's one here: openvenues/jpostal#38 and another by the Overture Maps folks here: OvertureMaps/jpostal#1 so looking to consolidate those ideas into one PR and will merge that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants