
JIT: Add support for WASM32 #2260

Open
pguyot wants to merge 10 commits into atomvm:release-0.7 from pguyot:w14/jit-wasm

@pguyot
Collaborator

@pguyot pguyot commented Apr 6, 2026

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later

@github-advanced-security

You are seeing this message because GitHub Code Scanning has recently been set up for this repository, or this pull request contains the workflow file for the Code Scanning tool.

What Enabling Code Scanning Means:

  • The 'Security' tab will display more code scanning analysis results (e.g., for the default branch).
  • Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results.
  • You will be able to see the analysis results for the pull request's branch on this overview once the scans have completed and the checks have passed.

For more information about GitHub Code Scanning, check out the documentation.

@pguyot pguyot force-pushed the w14/jit-wasm branch 2 times, most recently from cb181a7 to 65b7bd7 Compare April 7, 2026 19:51
@petermm
Contributor

petermm commented Apr 9, 2026

AMP:

PR Review: WASM32 JIT Backend for AtomVM

Reviewer: AI-assisted review
Date: 2026-04-09
Commits reviewed: 7 (2c2428f..65b7bd7)
Author: Paul Guyot <pguyot@kallisys.net>


Summary

This PR adds a complete WASM32 JIT backend to AtomVM, enabling JIT compilation under Emscripten (browser/Node.js). Two architectural decisions stand out. First, WASM cannot use function pointer arithmetic the way native backends do, so a label-based continuation encoding (label + 1) replaces direct function pointers. Second, WASM compilation is done per thread, because each Emscripten pthread has its own wasmTable.

Commits

| # | SHA | Title | Files Changed | Lines |
|---|-----|-------|---------------|-------|
| 1 | 2c2428f9d | JIT: add WASM32 support to jit_tests_common | 1 | +217/-49 |
| 2 | f19d34595 | JIT: add WASM32 assembler and tests | 2 | +902 |
| 3 | 8125935e5 | JIT: allow backends to disable tail cache | 1 | +34/-14 |
| 4 | 3dbb496b1 | JIT: reorder reduction decrement before register reads in apply | 1 | +14/-14 |
| 5 | 54d5f5464 | JIT: record line info at continuation labels after function splits | 1 | +28/-9 |
| 6 | 331f9cc34 | JIT: add WASM32 JIT backend, emscripten integration, and CI | 29 | +4802/-45 |
| 7 | 65b7bd746 | emscripten: allow heap to grow, starts with 64MB | 1 | +1/-1 |

Architecture Assessment: ✅ Sound

The JIT_JUMPTABLE_IS_DATA abstraction cleanly separates the WASM model from native backends:

  • Native backends: continuation is a function pointer; jump table entries are executable code
  • WASM backend: continuation stores label + 1; jump table is data; per-thread resolution via jit_wasm_get_entry_point()

The label + 1 encoding is consistent:

  • Encode: (NativeContinuation)(label + 1) via JIT_CONTINUATION_FOR_LABEL macro
  • Decode: (int)continuation - 1 at dispatch sites
  • 0 is reserved as "no continuation" — no off-by-one risk in normal paths
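The round trip described in the bullets above can be sketched in isolation. This is a minimal illustration, not the PR's code: `continuation_t`, `NO_CONTINUATION`, `continuation_for_label` and `label_for_continuation` are hypothetical names, and the continuation is modelled as a plain `uintptr_t` rather than the runtime's actual type.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the runtime's continuation type. */
typedef uintptr_t continuation_t;

/* 0 is reserved to mean "no continuation". */
#define NO_CONTINUATION ((continuation_t) 0)

/* Encode a non-negative label as a non-zero continuation value. */
static continuation_t continuation_for_label(int label)
{
    assert(label >= 0);
    return (continuation_t) label + 1;
}

/* Decode a continuation back to its label.
 * The caller must rule out NO_CONTINUATION first. */
static int label_for_continuation(continuation_t cont)
{
    assert(cont != NO_CONTINUATION);
    return (int) (cont - 1);
}
```

Because label 0 encodes to 1, every valid label yields a value distinct from `NO_CONTINUATION`, which is the property the review relies on.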

Issues Found

🔴 Critical: Native Code Memory Leak on Module Unload

compile_wasm_stream() allocates 4 objects that are never freed:

// jit_stream_wasm.c — compile_wasm_stream()
struct JITWasmHeader *header = malloc(sizeof(struct JITWasmHeader));  // (1)
header->wasm_binary = malloc(wasm_size);                              // (2)
header->lines_metadata = malloc(lines_data_size);                     // (3)
ModuleNativeEntryPoint *block = calloc(num_entries + 1, ...);         // (4)

But module_destroy() never frees any of these:

// module.c:1292-1311
COLD_FUNC void module_destroy(Module *module)
{
    free(module->labels);
    free(module->imported_funcs);
    free(module->literals_table);
    free(module->local_atoms_to_global_table);
    free(module->line_refs_offsets);
    // ... no mention of native_code
    free(module);
}

Recommendation: Add a platform hook sys_release_native_code(ModuleNativeEntryPoint) called from module_destroy() when module->native_code != NULL.
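One possible shape for such a hook is sketched below. It assumes a simplified layout: only `wasm_binary` and `lines_metadata` are field names taken from the excerpt above, while `JITWasmHeaderSketch`, `NativeCodeSketch`, `entries` and `release_native_code_sketch` are hypothetical and chosen for illustration. How the real runtime reaches the header from `module->native_code` is not shown in the review and is not assumed here.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical layout: only wasm_binary and lines_metadata are named in the review. */
struct JITWasmHeaderSketch
{
    uint8_t *wasm_binary;    /* allocation (2) */
    uint8_t *lines_metadata; /* allocation (3) */
};

/* Hypothetical bundle of everything allocated at compile time. */
struct NativeCodeSketch
{
    struct JITWasmHeaderSketch *header; /* allocation (1) */
    void **entries;                     /* allocation (4): entry-point block */
};

/* Free all four allocations; safe to call twice or with NULL. */
static void release_native_code_sketch(struct NativeCodeSketch *nc)
{
    if (nc == NULL) {
        return;
    }
    if (nc->header != NULL) {
        free(nc->header->wasm_binary);
        free(nc->header->lines_metadata);
        free(nc->header);
        nc->header = NULL;
    }
    free(nc->entries);
    nc->entries = NULL;
}
```

Resetting the pointers after `free()` makes the hook idempotent, which helps if a destroy path can be reached more than once.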


🔴 Critical: JS addFunction Handles Never Freed

Per-thread addFunction() calls grow the WASM table indefinitely:

// jit_stream_wasm.c — EM_JS jit_get_thread_func_ptr
cache[i] = addFunction(func, 'iiii');  // never removeFunction()'d

Module._jitCache is keyed by raw wasm_binary pointer. If C-side memory is freed and reused, stale cache entries could reference invalid WASM instances.

Recommendation: Add removeFunction() cleanup when modules are unloaded, and key the cache by a stable module ID rather than a raw pointer.


🟡 Important: NULL Function Pointer Not Guarded in Dispatch Loop

If per-thread WASM compilation fails, jit_wasm_get_entry_point() returns NULL:

// jit_stream_wasm.c:132-136
ModuleNativeEntryPoint jit_wasm_get_entry_point(const void *native_code, int label)
{
    // ...
    int fp = jit_get_thread_func_ptr(...);
    return (ModuleNativeEntryPoint) (uintptr_t) fp;  // 0 if compilation failed
}

The dispatch loop in opcodesswitch.h then calls through this NULL pointer:

// opcodesswitch.h:1770-1775 (JIT_JUMPTABLE_IS_DATA path)
int label = (int) jit_state.continuation - 1;
native_pc = module_get_native_entry_point(mod, label);
// native_pc may be NULL — no guard before call

Recommendation: Add if (UNLIKELY(native_pc == NULL)) { abort/raise_error; } after resolution.
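The guard can be illustrated with a small standalone sketch. All names here (`entry_point_t`, `dispatch_checked`, `DISPATCH_NO_ENTRY_POINT`, `sample_entry`) are hypothetical, and the entry-point signature is simplified to `int (*)(int)`. The real dispatch loop would abort or raise an error rather than return a status code.

```c
#include <stddef.h>

/* Simplified stand-in for a resolved entry point. */
typedef int (*entry_point_t)(int);

enum dispatch_result
{
    DISPATCH_OK,
    DISPATCH_NO_ENTRY_POINT
};

/* Call native_pc only if resolution succeeded; never call through NULL. */
static enum dispatch_result dispatch_checked(entry_point_t native_pc, int arg, int *out)
{
    if (native_pc == NULL) {
        /* Per-thread compilation failed: report instead of crashing. */
        return DISPATCH_NO_ENTRY_POINT;
    }
    *out = native_pc(arg);
    return DISPATCH_OK;
}

/* Example entry point used to exercise the success path. */
static int sample_entry(int x)
{
    return x + 1;
}
```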


🟡 Important: Metadata Parsing Has No Bounds Checking

Line info and cont_label_map are parsed from lines_metadata without knowing the buffer size:

// module.c:989-1009 — module_get_function_from_label (WASM path)
uint16_t lines_count = metadata[0] | (metadata[1] << 8);
const uint8_t *cont_map_ptr = metadata + 2 + lines_count * 6;
uint16_t cont_map_count = cont_map_ptr[0] | (cont_map_ptr[1] << 8);
// No size validation — malformed metadata → OOB read

Recommendation: Store lines_metadata_size in JITWasmHeader and validate offsets during parsing.
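A bounds-checked version of the header walk might look like the sketch below. It assumes the layout implied by the excerpt (a little-endian 16-bit line count, 6 bytes per line entry, then a little-endian 16-bit continuation-map count) and a caller-supplied buffer size. `read_u16_le` and `find_cont_map` are hypothetical helper names. The size of each continuation-map entry is not stated in the review, so the sketch stops at locating the map.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian u16 at offset `off`, failing if it does not fit. */
static bool read_u16_le(const uint8_t *buf, size_t size, size_t off, uint16_t *out)
{
    if (off > size || size - off < 2) {
        return false;
    }
    *out = (uint16_t) (buf[off] | (buf[off + 1] << 8));
    return true;
}

/* Locate the continuation map after the line table.
 * Assumes 6 bytes per line entry, as in the excerpt above. */
static bool find_cont_map(const uint8_t *metadata, size_t size,
    size_t *cont_map_off, uint16_t *cont_map_count)
{
    uint16_t lines_count;
    if (!read_u16_le(metadata, size, 0, &lines_count)) {
        return false;
    }
    size_t off = 2 + (size_t) lines_count * 6;
    if (!read_u16_le(metadata, size, off, cont_map_count)) {
        return false;
    }
    *cont_map_off = off + 2;
    return true;
}
```

Writing the check as `size - off < 2` after testing `off > size` avoids an unsigned overflow that `off + 2 > size` could hit for very large offsets.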


🟢 Non-Issue: defaultatoms.def Length Byte "Collision"

The \xA and \xB bytes are string length prefixes, not unique IDs:

X(JIT_ARMV6M_ATOM, "\xA", "jit_armv6m")   // 10 chars → \xA ✓
X(JIT_WASM32_ATOM, "\xA", "jit_wasm32")   // 10 chars → \xA ✓
X(JIT_RISCV32_ATOM, "\xB", "jit_riscv32") // 11 chars → \xB ✓
X(JIT_RISCV64_ATOM, "\xB", "jit_riscv64") // 11 chars → \xB ✓

This is correct. Atoms are distinguished by their string content, not the length byte.
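The length-prefix convention can be checked mechanically. The sketch below assumes the prefix and the name are concatenated into one string, which is an assumption about how the X-macro arguments are combined; `atom_prefix_matches` is a hypothetical helper.

```c
#include <stddef.h>
#include <string.h>

/* True when the first byte of a length-prefixed atom string
 * equals the length of the name that follows it. */
static int atom_prefix_matches(const char *prefixed)
{
    return (size_t) (unsigned char) prefixed[0] == strlen(prefixed + 1);
}
```

For example, `atom_prefix_matches("\xA" "jit_wasm32")` holds because the name is 10 characters long, whereas pairing `"\xA"` with the 11-character `"jit_riscv32"` would not. The prefix and name are kept as separate literals so the hex escape cannot absorb a following hex digit.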


🟢 Non-Issue: ModuleFunction Union Safety

// exportedfunction.h
struct ModuleFunction {
    struct ExportedFunction base;
    Module *target;
    union {
        ModuleNativeEntryPoint entry_point;
        int label;
    };
};

On WASM, only label is written/read. On native, only entry_point is written/read. The active member is platform-discriminated. Safe as used.


Per-Commit Review

Commit 1: 2c2428f9d — WASM32 support in jit_tests_common

Assessment: ✅ Good

Adds WAT-based cross-validation for WASM instruction encoding using wat2wasm + wasm-objdump. Cleanly extends the existing multi-arch test infrastructure.

The wasm_code_bytes/1 helper correctly parses WASM binary module format to extract instruction bytes from the code section.

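Minor: The find_wat2wasm/0 function uses os:cmd("which wat2wasm"), which on some systems returns a non-empty string even on failure. os:cmd/1 does not expose the exit code, so consider os:find_executable("wat2wasm") instead, which returns false when the tool is not found.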


Commit 2: f19d34595 — WASM32 assembler and tests

Assessment: ✅ Good

jit_wasm32_asm.erl (467 lines) provides LEB128 encoding, WASM instruction emission, and module structure building. jit_wasm32_asm_tests.erl (435 lines) covers individual instruction encoding, LEB128 edge cases, and module structure.

Test coverage is solid for encoding correctness. Each instruction has a corresponding test with expected binary output cross-validated against wat2wasm.
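For readers unfamiliar with LEB128, the unsigned variant used for WASM integer immediates can be sketched as follows. This is a C illustration of the standard encoding, not a transcription of jit_wasm32_asm.erl; `uleb128_encode` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode `value` as unsigned LEB128 into `out`.
 * A 32-bit value needs at most 5 bytes. Returns the number of bytes written. */
static size_t uleb128_encode(uint32_t value, uint8_t out[5])
{
    size_t n = 0;
    do {
        uint8_t byte = value & 0x7F; /* low 7 bits */
        value >>= 7;
        if (value != 0) {
            byte |= 0x80; /* continuation bit: more bytes follow */
        }
        out[n++] = byte;
    } while (value != 0);
    return n;
}
```

The edge cases worth testing are the 7-bit boundaries: 127 fits in one byte (0x7F), while 128 needs two (0x80 0x01).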


Commit 3: 8125935e5 — Allow backends to disable tail cache

Assessment: ✅ Good

Clean abstraction. The tail_cache_find/2 and tail_cache_store/3 helpers handle disabled transparently:

-type tail_cache() :: [{tuple(), non_neg_integer()}] | disabled.

tail_cache_find(_Key, disabled) -> false;
tail_cache_find(Key, TC) -> lists:keyfind(Key, 1, TC).

tail_cache_store(_Key, _Value, disabled) -> disabled;
tail_cache_store(Key, Value, TC) -> [{Key, Value} | TC].

Backend opt-in via supports_tail_cache/0 with erlang:function_exported/3 fallback is the right pattern for optional callbacks.


Commit 4: 3dbb496b1 — Reorder reduction decrement in apply

Assessment: ✅ Correct fix

Moves decrement_reductions_and_maybe_schedule_next before register reads in OP_APPLY, OP_APPLY_LAST, and OP_CALL_FUN2. This matches the interpreter's order and is required because WASM function splitting means a yield point can split the function, so registers must not be read before the yield check.

 first_pass(<<?OP_APPLY, Rest0/binary>>, MMod, MSt0, State0) ->
     ?ASSERT_ALL_NATIVE_FREE(MSt0),
     {Arity, Rest1} = decode_literal(Rest0),
-    {MSt1, Module} = read_any_xreg(Arity, MMod, MSt0),
-    {MSt2, Function} = read_any_xreg(Arity + 1, MMod, MSt1),
     ?TRACE("OP_APPLY ~p\n", [Arity]),
-    MSt3 = verify_is_atom(Module, 0, MMod, MSt2),
-    MSt4 = verify_is_atom(Function, 0, MMod, MSt3),
-    MSt5 = MMod:decrement_reductions_and_maybe_schedule_next(MSt4),
+    MSt1 = MMod:decrement_reductions_and_maybe_schedule_next(MSt0),
+    {MSt2, Module} = read_any_xreg(Arity, MMod, MSt1),
+    {MSt3, Function} = read_any_xreg(Arity + 1, MMod, MSt2),
+    MSt4 = verify_is_atom(Module, 0, MMod, MSt3),
+    MSt5 = verify_is_atom(Function, 0, MMod, MSt4),

Commit 5: 54d5f5464 — Record line info at continuation labels

Assessment: ✅ Good

Adds current_line tracking and record_continuation_line/3 to emit line info at function-splitting continuation points. This ensures that WASM's split functions have accurate line info for stacktraces:

record_continuation_line(_MMod, _MSt, #state{current_line = undefined} = State) ->
    State;
record_continuation_line(MMod, MSt, #state{current_line = Line, line_offsets = AccLines} = State) ->
    Offset = MMod:offset(MSt),
    State#state{line_offsets = [{Line, Offset} | AccLines]}.

Applied at all call/apply/fun opcodes that create continuation points. On native backends this is a no-op since the offset is within the same function.


Commit 6: 331f9cc34 — WASM32 JIT backend + CI

Assessment: ✅ Good architecture, issues noted above

This is the main commit (~4800 lines). The Erlang backend (jit_wasm32.erl, 1866 lines) generates WebAssembly bytecode where each BEAM label compiles to a separate WASM function. The C runtime changes add JIT_JUMPTABLE_IS_DATA support throughout the dispatch loop, continuation handling, and module management.

CI configuration is thorough:

  • Compiles JIT tests on host with wasm32 target arch
  • Runs eunit tests on host (BEAM)
  • Builds Emscripten with and without JIT (matrix)
  • Runs hello_world + library tests under JIT
  • CodeQL only runs for non-JIT to avoid false positives

#ifdef nesting concern: The opcodesswitch.h changes add nested #ifdef JIT_JUMPTABLE_IS_DATA inside #ifndef AVM_NO_JIT inside #ifndef AVM_NO_EMU blocks. This is getting deep but is manageable for now. Consider centralizing encode/decode helpers to reduce future spread:

// Suggested helpers to reduce #ifdef repetition:
static inline NativeContinuation jit_continuation_from_label(Module *mod, int label) {
#ifdef JIT_JUMPTABLE_IS_DATA
    return (NativeContinuation)(label + 1);
#else
    return module_get_native_entry_point(mod, label);
#endif
}

static inline ModuleNativeEntryPoint jit_resolve_continuation(Module *mod, NativeContinuation cont) {
#ifdef JIT_JUMPTABLE_IS_DATA
    return module_get_native_entry_point(mod, (int)cont - 1);
#else
    return cont;
#endif
}

Commit 7: 65b7bd746 — Emscripten heap growth

Assessment: ✅ Good

Simple, necessary change for JIT workloads:

-target_link_options(AtomVM PRIVATE ${JIT_LINK_FLAGS} -sUSE_ZLIB=1 -O3 -pthread -sFETCH -lwebsocket.js --pre-js ${CMAKE_CURRENT_SOURCE_DIR}/atomvm.pre.js)
+target_link_options(AtomVM PRIVATE ${JIT_LINK_FLAGS} -sUSE_ZLIB=1 -O3 -pthread -sFETCH -lwebsocket.js --pre-js ${CMAKE_CURRENT_SOURCE_DIR}/atomvm.pre.js -sINITIAL_MEMORY=67108864 -sALLOW_MEMORY_GROWTH)

64MB initial + growable is appropriate for JIT compilation workloads that generate WASM modules at runtime.


Test Coverage Assessment

What's Covered ✅

| Area | Test File | Lines | Coverage |
|------|-----------|-------|----------|
| WASM instruction encoding | jit_wasm32_asm_tests.erl | 435 | All instruction types, LEB128 edge cases |
| WASM backend codegen | jit_wasm32_tests.erl | 2235 | Opcode compilation, register handling, control flow |
| Cross-arch test infra | jit_tests_common.erl | +217 | WAT assembly validation |
| CI integration | wasm-build.yaml | +81 | Build + hello_world + library tests under JIT |

What's Missing ❌

These test scenarios should be added to strengthen confidence in the runtime:

  1. Multi-thread execution: Same JITted module running on 2+ schedulers/pthreads to exercise per-thread compilation
  2. Thread migration on yield: Forced reduction yield + resume on a different thread — validates saved_function_ptr label encoding survives cross-thread dispatch
  3. Trap/load/resume cycle: code_server trap → load → resume path under WASM JIT (exercises jit_trap_and_load + CodeServerResumeSignal handling)
  4. Module reload/unload loop: Repeated load/unload to detect native-code memory growth and JS handle leaks
  5. Stacktrace accuracy after function splits: Verify record_continuation_line produces correct line info in stacktraces when a continuation label is hit

Actionable Recommendations

Must-fix before merge

  1. Add native-code release hook in module_destroy() for WASM allocations
  2. Guard NULL native_pc after module_get_native_entry_point() in the dispatch loop

Should-fix soon after merge

  1. Add removeFunction() cleanup in JS cache when modules are unloaded
  2. Store lines_metadata_size in JITWasmHeader for bounds checking
  3. Add multi-thread and lifecycle integration tests

Nice-to-have

  1. Centralize label + 1 encode/decode into helper functions to reduce #ifdef spread
  2. Key JS cache by stable module ID instead of raw pointer to prevent stale cache issues

Overall Verdict

Architecture: ✅ The JIT_JUMPTABLE_IS_DATA + label continuation approach is the right design for WASM.
Correctness: ⚠️ The encoding logic is correct, but memory lifecycle and NULL-resolution paths need hardening.
Tests: ⚠️ Strong unit/codegen coverage, but runtime/lifecycle/threading tests are missing.
Build/CI: ✅ Well-structured matrix build with both JIT and non-JIT WASM targets.
Code quality: ✅ Clean commit sequence, good separation of concerns, well-commented.

@pguyot
Collaborator Author

pguyot commented Apr 9, 2026

Just addressed the two "CRITICAL" points in the AMP review, even though we don't really have module unloading for now. Also fixed an mmap leak we had with generic unix on module destroy.

pguyot added 10 commits April 10, 2026 23:58
Signed-off-by: Paul Guyot <pguyot@kallisys.net>
Signed-off-by: Paul Guyot <pguyot@kallisys.net>
Signed-off-by: Paul Guyot <pguyot@kallisys.net>
Add optional backend callback `supports_tail_cache/0`

Signed-off-by: Paul Guyot <pguyot@kallisys.net>
Move `decrement_reductions_and_maybe_schedule_next` before reading
extended registers and is_atom checks in OP_APPLY and OP_APPLY_LAST.
This matches the interpreter's order and is required for WASM.

Signed-off-by: Paul Guyot <pguyot@kallisys.net>
Add current_line tracking to the compiler state and
`record_continuation_line/3` to emit line info at continuation points.

Signed-off-by: Paul Guyot <pguyot@kallisys.net>
…tion

Signed-off-by: Paul Guyot <pguyot@kallisys.net>
Signed-off-by: Paul Guyot <pguyot@kallisys.net>
Also reduce the size of copied / mapped data to the current architecture
code for hypothetical cases where we deploy a precompiled module with
several architectures

Signed-off-by: Paul Guyot <pguyot@kallisys.net>
Signed-off-by: Paul Guyot <pguyot@kallisys.net>