erlang: wasm32 -O2 miscompilation coverage (kd-r8h7 steps 1-4)#835

Open
brandonpayton wants to merge 1 commit into main from gascity/kd-1mr/kd-jin7-implement-wasm32-o2-miscompilation-coverage-kd-r8h7-desi

Conversation

@brandonpayton

Member

Implements the low-risk core (steps 1-4) of the kd-r8h7 design
(#832,
docs/plans/2026-07-02-erlang-wasm32-o2-miscompilation-coverage-design.md):
turn the reactive per-file -O1 posture for LLVM wasm32 -O2 miscompilations
into a systematic, bounded, detected one — without a blanket -O1.

What & why

ERTS dodges known wasm32 -O2 miscompilations with per-file -O1 + a
global.h init patch, discovered one production incident at a time (unicode
garbage, ETS OOB, and md5-over-iolist breaking beam_asm → on-Kandelo erlc).
This adds detection + a bounded process so a new instance fails CI, not a user.

  • Step 1 — Audit. ESTACK/WSTACK/EQUEUE/DMC risk audit of OTP 28.2 ERTS
    (test-runs/kd-jin7/audit-*). Corrects the design's guessed list: refutes
    erl_bif_binary.c/erl_bif_re.c (no idiom), adds erl_term_hashing.c
    (phash), erl_iolist.c/erl_io_queue.c (EQUEUE iodata); flags that the
    facet-1 global.h init patch covers only ESTACK/WSTACK, not EQUEUE/DMC.
  • Step 2 — Registry. packages/registry/erlang/wasm32-miscompilations.md:
    greppable source of truth: triage runbook, applied-workaround table (chksum
    marked PR #824), detection-only table, CI wiring, removal checklist, OTP-bump
    re-audit trigger. Referenced from build-erlang.sh. Doc/comment-only → no
    build output byte change → build.toml revision intentionally not bumped.
  • Step 3 — Detection matrix + local runner.
    test/wasm32-miscompilation-matrix.ts (single {name, expr, expected} source,
    native-OTP-28 oracles, inputs sized past DEF_*_SIZE == 16 to force the
    heap-stack path); test/erlang.test.ts runs the whole matrix in one BEAM boot
    under skipIf.
  • Step 4 — CI gate runner. test/run-wasm32-miscompilation-smoke.ts: one
    boot; fails on mismatch, on incompletion, and on the false-coverage case (no
    active case ran = missing OTP tree — the design's highest-listed risk). Emits
    passed/failed/skipped outcome lists.
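
The three failure conditions in Step 4 can be sketched as a small decision function. This is a minimal illustration, not the code in test/run-wasm32-miscompilation-smoke.ts: the `CaseResult` and `GateVerdict` shapes and the `evaluateGate` name are hypothetical, and it assumes pending cases are reported as `skipped`.

```typescript
// Hypothetical shapes; the PR's real types are not shown in this description.
type Outcome = "passed" | "failed" | "skipped";

interface CaseResult {
  name: string;
  outcome: Outcome;
}

interface GateVerdict {
  ok: boolean;
  reason: string;
}

// Decide the gate from per-case results plus a flag saying whether the
// runner saw the matrix run to completion.
function evaluateGate(results: CaseResult[], completed: boolean): GateVerdict {
  const failed = results.filter((r) => r.outcome === "failed");
  if (failed.length > 0) {
    const names = failed.map((r) => r.name).join(", ");
    return { ok: false, reason: `mismatch: ${names}` };
  }
  if (!completed) {
    return { ok: false, reason: "incomplete: matrix did not finish" };
  }
  const active = results.filter((r) => r.outcome === "passed");
  if (active.length === 0) {
    // Nothing actually ran, so a green result would claim coverage that
    // does not exist (for example, when the OTP tree is missing).
    return { ok: false, reason: "false coverage: no active case ran" };
  }
  return { ok: true, reason: `${active.length} active case(s) passed` };
}
```

Checking for zero active cases last means an all-skipped run fails the gate even though nothing mismatched, which is the property the false-coverage guard exists to provide.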

Verification (native OTP 28 — the oracle source)

A from-source erlang.wasm build was not run (erlang is CI-disabled, "too slow"
to rebuild); the matrix was validated on native OTP 28 via dev-shell, exactly
where the design says oracles come from:

  • Positive: 8/8 active pass, 2 pending skip, matrix_done, gate exit 0.
  • Negative control: a corrupted oracle → FAIL … expected=… got=…, non-zero.
  • False-coverage guard: no OTP tree → GATE FAIL, exit 1.
  • md5/crc32/adler32 oracle cross-checked byte-for-byte vs Python
    hashlib/zlib. All 3 TS files pass esbuild transform.
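
For illustration, an independent cross-check of checksum oracles could look like the sketch below. The PR did this with Python hashlib/zlib; this version instead reimplements CRC-32 and Adler-32 in TypeScript and compares them against the widely published check values for the inputs "123456789" and "Wikipedia". The helper names are hypothetical and md5 is left out to keep the block dependency-free.

```typescript
// Encode an ASCII string as bytes without relying on TextEncoder.
function ascii(s: string): Uint8Array {
  const out = new Uint8Array(s.length);
  for (let i = 0; i < s.length; i++) {
    out[i] = s.charCodeAt(i) & 0xff;
  }
  return out;
}

// Bitwise CRC-32 with the reflected polynomial 0xEDB88320 (the zlib variant).
function crc32(bytes: Uint8Array): number {
  let crc = 0xffffffff;
  for (let i = 0; i < bytes.length; i++) {
    crc ^= bytes[i];
    for (let bit = 0; bit < 8; bit++) {
      // -(crc & 1) is all ones when the low bit is set, zero otherwise.
      crc = (crc >>> 1) ^ (0xedb88320 & -(crc & 1));
    }
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// Adler-32 as defined for zlib: two running sums modulo 65521.
function adler32(bytes: Uint8Array): number {
  let a = 1;
  let b = 0;
  for (let i = 0; i < bytes.length; i++) {
    a = (a + bytes[i]) % 65521;
    b = (b + a) % 65521;
  }
  return ((b << 16) | a) >>> 0;
}

// Compare against the standard published check values.
const crcOk = crc32(ascii("123456789")) === 0xcbf43926;
const adlerOk = adler32(ascii("Wikipedia")) === 0x11e60398;
console.log(`crc32 ok: ${crcOk}, adler32 ok: ${adlerOk}`);
```

An oracle that agrees with a second, unrelated implementation is unlikely to have been captured from a build that already carries the miscompilation.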

Full write-up + outcome lists: test-runs/kd-jin7/SUMMARY.md.

Scope / follow-ups

  • chksum_iolist + compile_module are pending PR #824 (kd-qe2c), "erlang: fix
    erlang:md5/1 badarg on iolist input (unblocks on-Kandelo compilation)"; they are reported
    as expected skips until that -O1 lands (flip pendingPr in the same change).
  • PR-gate promotion is documented, not landed: the bottle-build/smoke job +
    framebuffer gate (kd-ivdr/kd-jg94) are not on main and no current CI job has
    the OTP runtime tree, so a hard gate today would trip the false-coverage guard.
    The verified runner + exact wiring block are ready in the registry.
  • Deferred design steps 5-7 (coverage closure, perf baseline, upstream LLVM) stay
    future beads.

🤖 Generated with Claude Code

Convert the reactive per-file -O1 posture for LLVM wasm32 -O2
miscompilations into a systematic, bounded, detected one. Implements the
low-risk core (steps 1-4) of the kd-r8h7 design.

Step 1 (audit): ESTACK/WSTACK/EQUEUE/DMC risk audit of OTP 28.2 ERTS.
Corrects the design's guessed list — refutes erl_bif_binary.c/erl_bif_re.c
(no idiom), adds erl_term_hashing.c (phash), erl_iolist.c/erl_io_queue.c
(EQUEUE iodata) — and surfaces that the facet-1 global.h init patch covers
only ESTACK/WSTACK, not EQUEUE/DMC. Artifacts under test-runs/kd-jin7/.

Step 2 (registry): packages/registry/erlang/wasm32-miscompilations.md — the
greppable source of truth (triage runbook, applied-workaround table with
chksum marked PR #824, detection-only table, CI wiring, removal checklist,
OTP-bump re-audit trigger). Referenced from build-erlang.sh at both
workaround sites. Doc/comment-only: no build output byte change, so
build.toml revision is intentionally NOT bumped.

Step 3 (detection matrix + local runner): test/wasm32-miscompilation-matrix.ts
is the single {name, expr, expected} source with native-OTP-28 oracles and
inputs sized past DEF_*_SIZE (16) to force the heap-stack path;
test/erlang.test.ts runs the whole matrix in one BEAM boot under skipIf.
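
As a rough sketch of the "sized past DEF_*_SIZE" idea, the helper below generates a matrix entry whose input nests deeper than 16 levels. Everything here is illustrative: `MatrixCase`, `nestedIolist`, and `sizedCase` are hypothetical names, the case is not one of the PR's actual entries, and whether this particular shape spills past the inline stack depends on the ERTS traversal, which the Step 1 audit (not this sketch) establishes. The expected value is derived arithmetically rather than taken from a native-OTP oracle.

```typescript
// Value quoted in the PR text for DEF_*_SIZE.
const DEF_STACK_SIZE = 16;

// Mirrors the {name, expr, expected} triple described above.
interface MatrixCase {
  name: string;
  expr: string; // Erlang expression evaluated on the target
  expected: string; // printed result to compare against
}

// Build Erlang source for an iolist nested `depth` levels deep, where every
// level has a nested head and a one-byte tail:
//   depth 2 -> [[<<"x">>,<<"x">>],<<"x">>]
function nestedIolist(depth: number): string {
  let term = '<<"x">>';
  for (let i = 0; i < depth; i++) {
    term = "[" + term + ',<<"x">>]';
  }
  return term;
}

// Refuse depths that could fit in the default stack, so a case cannot
// silently stop exercising the path it is meant to cover.
function sizedCase(depth: number): MatrixCase {
  if (depth <= DEF_STACK_SIZE) {
    throw new Error(`depth ${depth} does not exceed ${DEF_STACK_SIZE}`);
  }
  return {
    name: `iolist_size_depth_${depth}`,
    expr: `erlang:iolist_size(${nestedIolist(depth)})`,
    // One byte at the core plus one byte per nesting level.
    expected: String(depth + 1),
  };
}

console.log(sizedCase(DEF_STACK_SIZE * 4).name);
```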

Step 4 (CI gate runner): test/run-wasm32-miscompilation-smoke.ts — one boot,
fails on mismatch, on incompletion, and on the false-coverage case (no active
case ran = missing OTP tree). Emits passed/failed/skipped outcome lists.
PR-gate promotion is documented in the registry; it is deferred because the
bottle-build/smoke job + framebuffer gate are not yet on main and erlang is
CI-disabled (see SUMMARY).
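
One way a single-boot runner could turn its output into outcome lists is sketched below. The `matrix_done` sentinel and the `FAIL … expected=… got=…` shape appear in the verification notes; the `PASS`/`SKIP` line prefixes, the `Outcomes` type, and `parseRunnerOutput` are assumptions made for this example and may not match the real runner.

```typescript
interface Outcomes {
  passed: string[];
  failed: string[];
  skipped: string[];
  completed: boolean; // true only if the completion sentinel was seen
}

// Parse line-oriented runner output. Assumed line shapes:
//   PASS <name>
//   SKIP <name>
//   FAIL <name> expected=<term> got=<term>
//   matrix_done
function parseRunnerOutput(stdout: string): Outcomes {
  const out: Outcomes = { passed: [], failed: [], skipped: [], completed: false };
  const lines = stdout.split("\n");
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i].trim();
    if (line === "matrix_done") {
      out.completed = true;
      continue;
    }
    const m = /^(PASS|FAIL|SKIP)\s+(\S+)/.exec(line);
    if (m === null) {
      continue; // unrelated boot or log output
    }
    if (m[1] === "PASS") {
      out.passed.push(m[2]);
    } else if (m[1] === "FAIL") {
      out.failed.push(m[2]);
    } else {
      out.skipped.push(m[2]);
    }
  }
  return out;
}

const sample = "booting\nPASS phash_deep\nSKIP chksum_iolist\nmatrix_done\n";
console.log(JSON.stringify(parseRunnerOutput(sample)));
```

Tracking the sentinel separately from the per-case lines lets a crash partway through the matrix be told apart from a clean run with few cases.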

Verified on native OTP 28 (the oracle source) via dev-shell: 8/8 active pass,
2 pending skip (chksum/compile behind PR #824); negative control fails on a
corrupted oracle; false-coverage guard fails with no tree; md5/crc32/adler32
oracle cross-checked against Python hashlib/zlib.

Also documents the miscompilation class as a cross-package pitfall in
docs/porting-guide.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@github-actions

github-actions Bot commented Jul 2, 2026

Phase B-1 matrix build status — pr-835-staging

ABI v16. 67 built, 0 failed, 67 total.

| Package | Arch | Status | Sha |
| --- | --- | --- | --- |
| libcurl | wasm32 | built | b01be826 |
| libcxx | wasm32 | built | bd2dd5a6 |
| libcxx | wasm64 | built | 926cfc96 |
| libpng | wasm32 | built | 25e56ef0 |
| libxml2 | wasm32 | built | eb77d16a |
| libxml2 | wasm64 | built | 9743c3c9 |
| openssl | wasm32 | built | 296793ae |
| openssl | wasm64 | built | b53a16d9 |
| sqlite | wasm32 | built | 93b98c80 |
| sqlite | wasm64 | built | 7079a1ac |
| zlib | wasm32 | built | 4543740c |
| zlib | wasm64 | built | 4ccf5221 |
| bc | wasm32 | built | 0c54e287 |
| bzip2 | wasm32 | built | 614536d6 |
| coreutils | wasm32 | built | e1a33298 |
| curl | wasm32 | built | ecbc3967 |
| dash | wasm32 | built | 4f2caf8b |
| diffutils | wasm32 | built | 34f174f8 |
| dinit | wasm32 | built | 97a7849b |
| fbdoom | wasm32 | built | 3799bb5e |
| file | wasm32 | built | 535dd689 |
| findutils | wasm32 | built | 4b727dbf |
| gawk | wasm32 | built | 92e5184d |
| git | wasm32 | built | 42b4d1cb |
| grep | wasm32 | built | fd4d79fa |
| gzip | wasm32 | built | 8f54d8ac |
| kandelo-sdk | wasm32 | built | ba15af22 |
| kernel | wasm32 | built | 917959d2 |
| less | wasm32 | built | 7f3184f8 |
| lsof | wasm32 | built | 4dc5ae5b |
| m4 | wasm32 | built | 5fbbe8b3 |
| make | wasm32 | built | 57a6d854 |
| mariadb | wasm32 | built | 9ed4aa94 |
| mariadb | wasm64 | built | eaf29411 |
| modeset | wasm32 | built | d9f12284 |
| msmtpd | wasm32 | built | 23172a3a |
| nano | wasm32 | built | 763f56f8 |
| ncurses | wasm32 | built | a7862ee0 |
| netcat | wasm32 | built | 57bdd0cd |
| nginx | wasm32 | built | 933bdcd3 |
| php | wasm32 | built | e047ea3c |
| posix-utils-lite | wasm32 | built | 2aec933c |
| sed | wasm32 | built | 9d958a03 |
| spidermonkey | wasm32 | built | ef0ac7d5 |
| tar | wasm32 | built | 050e7cb3 |
| tcl | wasm32 | built | ee9ae67a |
| unzip | wasm32 | built | b176c19f |
| userspace | wasm32 | built | a9e013a2 |
| vim | wasm32 | built | 8194ff88 |
| wget | wasm32 | built | 378dca44 |
| xz | wasm32 | built | 763b854c |
| zip | wasm32 | built | 1c834314 |
| zstd | wasm32 | built | bcd693e5 |
| bash | wasm32 | built | d435c352 |
| mariadb-test | wasm32 | built | 74e71062 |
| mariadb-vfs | wasm32 | built | 85427981 |
| mariadb-vfs | wasm64 | built | 6c3ac528 |
| nethack | wasm32 | built | 5b7fc658 |
| node | wasm32 | built | cdad27ac |
| spidermonkey-node | wasm32 | built | 6d8dc738 |
| vim-browser-bundle | wasm32 | built | ef862c6f |
| nethack-browser-bundle | wasm32 | built | d4992e6b |
| rootfs | wasm32 | built | ae3df667 |
| shell | wasm32 | built | 61c3bb81 |
| lamp | wasm32 | built | f5445d80 |
| node-vfs | wasm32 | built | 75f6e69a |
| wordpress | wasm32 | built | 1c1bcff5 |

Auto-generated; replaced on each push. Raw data in the publish-status workflow artifact.
