JIT: Add support for WASM32 by pguyot · Pull Request #2260 · atomvm/AtomVM

pguyot · 2026-04-06T22:28:46Z

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later

github-advanced-security · 2026-04-06T23:34:07Z

You are seeing this message because GitHub Code Scanning has recently been set up for this repository, or this pull request contains the workflow file for the Code Scanning tool.

What Enabling Code Scanning Means:

The 'Security' tab will display more code scanning analysis results (e.g., for the default branch).
Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results.
You will be able to see the analysis results for the pull request's branch on this overview once the scans have completed and the checks have passed.

For more information about GitHub Code Scanning, check out the documentation.

petermm · 2026-04-09T14:20:37Z

AMP:

PR Review: WASM32 JIT Backend for AtomVM

Reviewer: AI-assisted review
Date: 2026-04-09
Commits reviewed: 7 (2c2428f..65b7bd7)
Author: Paul Guyot <pguyot@kallisys.net>

Summary

This PR adds a complete WASM32 JIT backend to AtomVM, allowing JIT compilation when running under Emscripten (browser/Node.js). The key architectural constraint is that WASM cannot use function pointer arithmetic the way native backends do. The PR therefore replaces direct function pointers with a label-based continuation encoding (label + 1), and compiles WASM per thread because Emscripten pthreads have separate wasmTables.

Commits

| # | SHA | Title | Files Changed | Lines |
|---|-----|-------|---------------|-------|
| 1 | `2c2428f9d` | JIT: add WASM32 support to jit_tests_common | 1 | +217/-49 |
| 2 | `f19d34595` | JIT: add WASM32 assembler and tests | 2 | +902 |
| 3 | `8125935e5` | JIT: allow backends to disable tail cache | 1 | +34/-14 |
| 4 | `3dbb496b1` | JIT: reorder reduction decrement before register reads in apply | 1 | +14/-14 |
| 5 | `54d5f5464` | JIT: record line info at continuation labels after function splits | 1 | +28/-9 |
| 6 | `331f9cc34` | JIT: add WASM32 JIT backend, emscripten integration, and CI | 29 | +4802/-45 |
| 7 | `65b7bd746` | emscripten: allow heap to grow, starts with 64MB | 1 | +1/-1 |

Architecture Assessment: ✅ Sound

The JIT_JUMPTABLE_IS_DATA abstraction cleanly separates the WASM model from native backends:

Native backends: continuation is a function pointer; jump table entries are executable code
WASM backend: continuation stores label + 1; jump table is data; per-thread resolution via jit_wasm_get_entry_point()

The label + 1 encoding is consistent:

Encode: (NativeContinuation)(label + 1) via JIT_CONTINUATION_FOR_LABEL macro
Decode: (int)continuation - 1 at dispatch sites
0 is reserved as "no continuation" — no off-by-one risk in normal paths
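
The round trip described above can be illustrated with a minimal C sketch. It assumes a pointer-sized continuation type; `NativeContinuationSketch`, `encode_label`, `decode_label`, and `has_continuation` are hypothetical names for illustration only, and the PR itself uses the JIT_CONTINUATION_FOR_LABEL macro.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for NativeContinuation, assumed to be pointer-sized. */
typedef void *NativeContinuationSketch;

/* Encode: label N is stored as N + 1, so label 0 never collides with NULL. */
NativeContinuationSketch encode_label(int label)
{
    return (NativeContinuationSketch) (uintptr_t) (label + 1);
}

/* A zero value means "no continuation". */
bool has_continuation(NativeContinuationSketch cont)
{
    return cont != NULL;
}

/* Decode: only meaningful when has_continuation() is true. */
int decode_label(NativeContinuationSketch cont)
{
    return (int) (uintptr_t) cont - 1;
}
```

With this shape, label 0 encodes to a non-NULL value, which is why the reserved 0 stays unambiguous.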

Issues Found

🔴 Critical: Native Code Memory Leak on Module Unload

compile_wasm_stream() allocates 4 objects that are never freed:

// jit_stream_wasm.c — compile_wasm_stream()
struct JITWasmHeader *header = malloc(sizeof(struct JITWasmHeader));  // (1)
header->wasm_binary = malloc(wasm_size);                              // (2)
header->lines_metadata = malloc(lines_data_size);                     // (3)
ModuleNativeEntryPoint *block = calloc(num_entries + 1, ...);         // (4)

But module_destroy() never frees any of these:

// module.c:1292-1311
COLD_FUNC void module_destroy(Module *module)
{
    free(module->labels);
    free(module->imported_funcs);
    free(module->literals_table);
    free(module->local_atoms_to_global_table);
    free(module->line_refs_offsets);
    // ... no mention of native_code
    free(module);
}

Recommendation: Add a platform hook sys_release_native_code(ModuleNativeEntryPoint) called from module_destroy() when module->native_code != NULL.
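
A release hook along these lines might look like the following sketch. The struct is a hypothetical mirror of the four allocations listed above: the field names follow the excerpt, but the real JITWasmHeader layout, and whether the entry block is reachable from the header, are assumptions. `release_native_code_sketch` is an illustrative name, not the hook proposed in the recommendation.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical mirror of the allocations in compile_wasm_stream(). */
struct JITWasmHeaderSketch
{
    uint8_t *wasm_binary;    /* allocation (2) */
    uint8_t *lines_metadata; /* allocation (3) */
    void **entries;          /* allocation (4), assumed reachable from the header */
};

/* Frees every owned buffer, then the header itself (allocation (1)).
 * Returns the number of allocations released, which is handy in tests. */
int release_native_code_sketch(struct JITWasmHeaderSketch *header)
{
    if (header == NULL) {
        return 0;
    }
    int released = 0;
    if (header->wasm_binary != NULL) {
        free(header->wasm_binary);
        released++;
    }
    if (header->lines_metadata != NULL) {
        free(header->lines_metadata);
        released++;
    }
    if (header->entries != NULL) {
        free(header->entries);
        released++;
    }
    free(header);
    return released + 1;
}
```

A caller such as module_destroy() would invoke the hook only when native code is present, and must not touch the header afterwards.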

🔴 Critical: JS `addFunction` Handles Never Freed

Per-thread addFunction() calls grow the WASM table indefinitely:

// jit_stream_wasm.c — EM_JS jit_get_thread_func_ptr
cache[i] = addFunction(func, 'iiii');  // never removeFunction()'d

Module._jitCache is keyed by raw wasm_binary pointer. If C-side memory is freed and reused, stale cache entries could reference invalid WASM instances.

Recommendation: Add removeFunction() cleanup when modules are unloaded, and key the cache by a stable module ID rather than a raw pointer.

🟡 Important: NULL Function Pointer Not Guarded in Dispatch Loop

If per-thread WASM compilation fails, jit_wasm_get_entry_point() returns NULL:

// jit_stream_wasm.c:132-136
ModuleNativeEntryPoint jit_wasm_get_entry_point(const void *native_code, int label)
{
    // ...
    int fp = jit_get_thread_func_ptr(...);
    return (ModuleNativeEntryPoint) (uintptr_t) fp;  // 0 if compilation failed
}

The dispatch loop in opcodesswitch.h then calls through this NULL pointer:

// opcodesswitch.h:1770-1775 (JIT_JUMPTABLE_IS_DATA path)
int label = (int) jit_state.continuation - 1;
native_pc = module_get_native_entry_point(mod, label);
// native_pc may be NULL — no guard before call

Recommendation: Add if (UNLIKELY(native_pc == NULL)) { abort/raise_error; } after resolution.
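
The guard could take roughly the following shape. This is a sketch under stated assumptions: `EntryPointSketch` is a simplified stand-in for ModuleNativeEntryPoint (whose real signature differs), the result enum and `dispatch_sketch` are hypothetical, and AtomVM's UNLIKELY branch hint is omitted for portability.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for ModuleNativeEntryPoint. */
typedef int (*EntryPointSketch)(int);

enum DispatchResultSketch
{
    DISPATCH_OK,
    DISPATCH_NO_ENTRY_POINT
};

/* Example entry point, present only to exercise the guard. */
int add_one_entry(int x)
{
    return x + 1;
}

/* Calls native_pc only after checking that resolution succeeded.
 * On failure *result is left untouched and an error code is returned,
 * so the caller can abort or raise instead of calling through NULL. */
enum DispatchResultSketch dispatch_sketch(EntryPointSketch native_pc, int arg, int *result)
{
    if (native_pc == NULL) {
        return DISPATCH_NO_ENTRY_POINT;
    }
    *result = native_pc(arg);
    return DISPATCH_OK;
}
```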

🟡 Important: Metadata Parsing Has No Bounds Checking

Line info and cont_label_map are parsed from lines_metadata without knowing the buffer size:

// module.c:989-1009 — module_get_function_from_label (WASM path)
uint16_t lines_count = metadata[0] | (metadata[1] << 8);
const uint8_t *cont_map_ptr = metadata + 2 + lines_count * 6;
uint16_t cont_map_count = cont_map_ptr[0] | (cont_map_ptr[1] << 8);
// No size validation — malformed metadata → OOB read

Recommendation: Store lines_metadata_size in JITWasmHeader and validate offsets during parsing.
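
As a rough illustration of that validation, the sketch below locates the cont_label_map header only after checking every offset against the buffer size. The 6-byte line entry stride and the little-endian 16-bit counts are taken from the excerpt above; `find_cont_map` and its parameters are hypothetical, and validating the map entries themselves is left out because their size is not shown in the excerpt.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stride of one line entry, from the `lines_count * 6` offset in the excerpt. */
#define LINE_ENTRY_SIZE_SKETCH 6

/* Locates the continuation map header inside metadata.
 * Returns false instead of reading past metadata_size. */
bool find_cont_map(const uint8_t *metadata, size_t metadata_size,
    const uint8_t **cont_map_ptr, uint16_t *cont_map_count)
{
    /* Need two bytes for lines_count. */
    if (metadata == NULL || metadata_size < 2) {
        return false;
    }
    uint16_t lines_count = (uint16_t) (metadata[0] | (metadata[1] << 8));
    size_t cont_map_offset = 2 + (size_t) lines_count * LINE_ENTRY_SIZE_SKETCH;
    /* Need two more bytes for cont_map_count after the line entries. */
    if (cont_map_offset > metadata_size || metadata_size - cont_map_offset < 2) {
        return false;
    }
    const uint8_t *p = metadata + cont_map_offset;
    *cont_map_ptr = p;
    *cont_map_count = (uint16_t) (p[0] | (p[1] << 8));
    return true;
}
```

The subtraction form of the second check avoids computing `cont_map_offset + 2` past the end of the buffer.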

🟢 Non-Issue: `defaultatoms.def` Length Byte "Collision"

The \xA and \xB bytes are string length prefixes, not unique IDs:

X(JIT_ARMV6M_ATOM, "\xA", "jit_armv6m")   // 10 chars → \xA ✓
X(JIT_WASM32_ATOM, "\xA", "jit_wasm32")   // 10 chars → \xA ✓
X(JIT_RISCV32_ATOM, "\xB", "jit_riscv32") // 11 chars → \xB ✓
X(JIT_RISCV64_ATOM, "\xB", "jit_riscv64") // 11 chars → \xB ✓

This is correct. Atoms are distinguished by their string content, not the length byte.
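
To make the point concrete, here is a small C sketch that compares length-prefixed atom strings. It is an illustration only, not AtomVM's atom table code, and `atom_string_equal` is a hypothetical helper: two atoms with the same length byte still compare unequal when their contents differ.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Compares two length-prefixed strings: byte 0 is the length,
 * followed by that many content bytes. */
bool atom_string_equal(const char *a, const char *b)
{
    uint8_t len_a = (uint8_t) a[0];
    uint8_t len_b = (uint8_t) b[0];
    return len_a == len_b && memcmp(a + 1, b + 1, len_a) == 0;
}
```

For example, `"\xA" "jit_armv6m"` and `"\xA" "jit_wasm32"` share the prefix byte but differ in content, so they are distinct atoms.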

🟢 Non-Issue: `ModuleFunction` Union Safety

// exportedfunction.h
struct ModuleFunction {
    struct ExportedFunction base;
    Module *target;
    union {
        ModuleNativeEntryPoint entry_point;
        int label;
    };
};

On WASM, only label is written/read. On native, only entry_point is written/read. The active member is platform-discriminated. Safe as used.

Per-Commit Review

Commit 1: `2c2428f9d` — WASM32 support in jit_tests_common

Assessment: ✅ Good

Adds WAT-based cross-validation for WASM instruction encoding using wat2wasm + wasm-objdump. Cleanly extends the existing multi-arch test infrastructure.

The wasm_code_bytes/1 helper correctly parses WASM binary module format to extract instruction bytes from the code section.

Minor: The find_wat2wasm/0 function uses os:cmd("which wat2wasm"), which on some systems returns a non-empty string even when the lookup fails. Consider checking the exit code instead.

Commit 2: `f19d34595` — WASM32 assembler and tests

Assessment: ✅ Good

jit_wasm32_asm.erl (467 lines) provides LEB128 encoding, WASM instruction emission, and module structure building. jit_wasm32_asm_tests.erl (435 lines) covers individual instruction encoding, LEB128 edge cases, and module structure.

Test coverage is solid for encoding correctness. Each instruction has a corresponding test with expected binary output cross-validated against wat2wasm.
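
For readers unfamiliar with LEB128, the unsigned variant used for WASM indices and sizes can be sketched in C as follows. This is an illustration of the encoding itself, not a port of jit_wasm32_asm.erl, and `encode_uleb128` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Encodes value as unsigned LEB128 into out and returns the byte count.
 * Each byte carries 7 payload bits, and the high bit marks "more bytes follow".
 * A 32-bit value needs at most 5 bytes. */
size_t encode_uleb128(uint32_t value, uint8_t out[5])
{
    size_t n = 0;
    do {
        uint8_t byte = (uint8_t) (value & 0x7F);
        value >>= 7;
        if (value != 0) {
            byte |= 0x80;
        }
        out[n++] = byte;
    } while (value != 0);
    return n;
}
```

Typical edge cases for such an encoder are 0 (one zero byte), 127 (largest single-byte value), and 128 (first two-byte value).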

Commit 3: `8125935e5` — Allow backends to disable tail cache

Assessment: ✅ Good

Clean abstraction. The tail_cache_find/2 and tail_cache_store/3 helpers handle disabled transparently:

-type tail_cache() :: [{tuple(), non_neg_integer()}] | disabled.

tail_cache_find(_Key, disabled) -> false;
tail_cache_find(Key, TC) -> lists:keyfind(Key, 1, TC).

tail_cache_store(_Key, _Value, disabled) -> disabled;
tail_cache_store(Key, Value, TC) -> [{Key, Value} | TC].

Backend opt-in via supports_tail_cache/0 with erlang:function_exported/3 fallback is the right pattern for optional callbacks.

Commit 4: `3dbb496b1` — Reorder reduction decrement in apply

Assessment: ✅ Correct fix

Moves decrement_reductions_and_maybe_schedule_next before register reads in OP_APPLY, OP_APPLY_LAST, and OP_CALL_FUN2. This matches the interpreter's order and is required because WASM function splitting means a yield point can split the function, so registers must not be read before the yield check.

 first_pass(<<?OP_APPLY, Rest0/binary>>, MMod, MSt0, State0) ->
     ?ASSERT_ALL_NATIVE_FREE(MSt0),
     {Arity, Rest1} = decode_literal(Rest0),
-    {MSt1, Module} = read_any_xreg(Arity, MMod, MSt0),
-    {MSt2, Function} = read_any_xreg(Arity + 1, MMod, MSt1),
     ?TRACE("OP_APPLY ~p\n", [Arity]),
-    MSt3 = verify_is_atom(Module, 0, MMod, MSt2),
-    MSt4 = verify_is_atom(Function, 0, MMod, MSt3),
-    MSt5 = MMod:decrement_reductions_and_maybe_schedule_next(MSt4),
+    MSt1 = MMod:decrement_reductions_and_maybe_schedule_next(MSt0),
+    {MSt2, Module} = read_any_xreg(Arity, MMod, MSt1),
+    {MSt3, Function} = read_any_xreg(Arity + 1, MMod, MSt2),
+    MSt4 = verify_is_atom(Module, 0, MMod, MSt3),
+    MSt5 = verify_is_atom(Function, 0, MMod, MSt4),

Commit 5: `54d5f5464` — Record line info at continuation labels

Assessment: ✅ Good

Adds current_line tracking and record_continuation_line/3 to emit line info at function-splitting continuation points. This ensures that WASM's split functions have accurate line info for stacktraces:

record_continuation_line(_MMod, _MSt, #state{current_line = undefined} = State) ->
    State;
record_continuation_line(MMod, MSt, #state{current_line = Line, line_offsets = AccLines} = State) ->
    Offset = MMod:offset(MSt),
    State#state{line_offsets = [{Line, Offset} | AccLines]}.

Applied at all call/apply/fun opcodes that create continuation points. On native backends this is a no-op since the offset is within the same function.

Commit 6: `331f9cc34` — WASM32 JIT backend + CI

Assessment: ✅ Good architecture, issues noted above

This is the main commit (~4800 lines). The Erlang backend (jit_wasm32.erl, 1866 lines) generates WebAssembly bytecode where each BEAM label compiles to a separate WASM function. The C runtime changes add JIT_JUMPTABLE_IS_DATA support throughout the dispatch loop, continuation handling, and module management.

CI configuration is thorough:

Compiles JIT tests on host with wasm32 target arch
Runs eunit tests on host (BEAM)
Builds Emscripten with and without JIT (matrix)
Runs hello_world + library tests under JIT
CodeQL only runs for non-JIT to avoid false positives

#ifdef nesting concern: The opcodesswitch.h changes add nested #ifdef JIT_JUMPTABLE_IS_DATA inside #ifndef AVM_NO_JIT inside #ifndef AVM_NO_EMU blocks. This is getting deep but is manageable for now. Consider centralizing encode/decode helpers to reduce future spread:

// Suggested helpers to reduce #ifdef repetition:
static inline NativeContinuation jit_continuation_from_label(Module *mod, int label) {
#ifdef JIT_JUMPTABLE_IS_DATA
    return (NativeContinuation)(label + 1);
#else
    return module_get_native_entry_point(mod, label);
#endif
}

static inline ModuleNativeEntryPoint jit_resolve_continuation(Module *mod, NativeContinuation cont) {
#ifdef JIT_JUMPTABLE_IS_DATA
    return module_get_native_entry_point(mod, (int)cont - 1);
#else
    return cont;
#endif
}

Commit 7: `65b7bd746` — Emscripten heap growth

Assessment: ✅ Good

Simple, necessary change for JIT workloads:

-target_link_options(AtomVM PRIVATE ${JIT_LINK_FLAGS} -sUSE_ZLIB=1 -O3 -pthread -sFETCH -lwebsocket.js --pre-js ${CMAKE_CURRENT_SOURCE_DIR}/atomvm.pre.js)
+target_link_options(AtomVM PRIVATE ${JIT_LINK_FLAGS} -sUSE_ZLIB=1 -O3 -pthread -sFETCH -lwebsocket.js --pre-js ${CMAKE_CURRENT_SOURCE_DIR}/atomvm.pre.js -sINITIAL_MEMORY=67108864 -sALLOW_MEMORY_GROWTH)

64MB initial + growable is appropriate for JIT compilation workloads that generate WASM modules at runtime.

Test Coverage Assessment

What's Covered ✅

| Area | Test File | Lines | Coverage |
|------|-----------|-------|----------|
| WASM instruction encoding | `jit_wasm32_asm_tests.erl` | 435 | All instruction types, LEB128 edge cases |
| WASM backend codegen | `jit_wasm32_tests.erl` | 2235 | Opcode compilation, register handling, control flow |
| Cross-arch test infra | `jit_tests_common.erl` | +217 | WAT assembly validation |
| CI integration | `wasm-build.yaml` | +81 | Build + hello_world + library tests under JIT |

What's Missing ❌

These test scenarios should be added to strengthen confidence in the runtime:

Multi-thread execution: Same JITted module running on 2+ schedulers/pthreads to exercise per-thread compilation
Thread migration on yield: Forced reduction yield + resume on a different thread — validates saved_function_ptr label encoding survives cross-thread dispatch
Trap/load/resume cycle: code_server trap → load → resume path under WASM JIT (exercises jit_trap_and_load + CodeServerResumeSignal handling)
Module reload/unload loop: Repeated load/unload to detect native-code memory growth and JS handle leaks
Stacktrace accuracy after function splits: Verify record_continuation_line produces correct line info in stacktraces when a continuation label is hit

Actionable Recommendations

Must-fix before merge

Add native-code release hook in module_destroy() for WASM allocations
Guard NULL native_pc after module_get_native_entry_point() in the dispatch loop

Should-fix soon after merge

Add removeFunction() cleanup in JS cache when modules are unloaded
Store lines_metadata_size in JITWasmHeader for bounds checking
Add multi-thread and lifecycle integration tests

Nice-to-have

Centralize label + 1 encode/decode into helper functions to reduce #ifdef spread
Key JS cache by stable module ID instead of raw pointer to prevent stale cache issues

Overall Verdict

Architecture: ✅ The JIT_JUMPTABLE_IS_DATA + label continuation approach is the right design for WASM.
Correctness: ⚠️ The encoding logic is correct, but memory lifecycle and NULL-resolution paths need hardening.
Tests: ⚠️ Strong unit/codegen coverage, but runtime/lifecycle/threading tests are missing.
Build/CI: ✅ Well-structured matrix build with both JIT and non-JIT WASM targets.
Code quality: ✅ Clean commit sequence, good separation of concerns, well-commented.

pguyot · 2026-04-09T21:17:08Z

I just addressed the two "CRITICAL" points in the AMP review, even though we don't really have module unloading for now. I also fixed an mmap leak we had with generic_unix on module destroy.

JIT: reorder reduction decrement before register reads in apply

Move `decrement_reductions_and_maybe_schedule_next` before reading extended registers and is_atom checks in OP_APPLY and OP_APPLY_LAST. This matches the interpreter's order and is required for WASM.

Signed-off-by: Paul Guyot <pguyot@kallisys.net>

pguyot force-pushed the w14/jit-wasm branch 2 times, most recently from cb181a7 to 65b7bd7, on April 7, 2026 19:51

pguyot force-pushed the w14/jit-wasm branch from 0c0164c to 6933242 on April 9, 2026 21:15

pguyot force-pushed the w14/jit-wasm branch from 6933242 to ee255ea on April 10, 2026 05:15

pguyot added 10 commits April 10, 2026 23:58

emscripten: fix function declarations

a04b73d

Signed-off-by: Paul Guyot <pguyot@kallisys.net>

JIT: add WASM32 support to jit_tests_common

6778713

Signed-off-by: Paul Guyot <pguyot@kallisys.net>

JIT: add WASM32 assembler and tests

540de05

Signed-off-by: Paul Guyot <pguyot@kallisys.net>

JIT: allow backends to disable tail cache

f46f02d

Add optional backend callback `supports_tail_cache/0`

Signed-off-by: Paul Guyot <pguyot@kallisys.net>

JIT: record line info at continuation labels after function splits

e8410bf

Add current_line tracking to the compiler state and `record_continuation_line/3` to emit line info at continuation points.

Signed-off-by: Paul Guyot <pguyot@kallisys.net>

JIT: add WASM32 JIT backend, emscripten integration, and CI configuration

283bca7

Signed-off-by: Paul Guyot <pguyot@kallisys.net>

emscripten: allow heap to grow, starts with 64MB

a0c5dd7

Signed-off-by: Paul Guyot <pguyot@kallisys.net>

JIT: free generic_unix and wasm jit resources on module destroy

57ff1e4

Also reduce the size of copied / mapped data to the current architecture's code, for hypothetical cases where we deploy a precompiled module with several architectures.

Signed-off-by: Paul Guyot <pguyot@kallisys.net>

JIT WASM32: call removeFunction on release of modules

9505890

Signed-off-by: Paul Guyot <pguyot@kallisys.net>

pguyot force-pushed the w14/jit-wasm branch from ee255ea to 9505890 on April 10, 2026 21:58

pguyot wants to merge 10 commits into atomvm:release-0.7 from pguyot:w14/jit-wasm

3 participants
