docs(fork-instrument): evaluate instrumentation size overhead (kd-ok55) #834

Draft
brandonpayton wants to merge 1 commit into main from gascity/kd-1mr/kd-ok55-evaluate-fork-instrumentation-size-overhead-42-of-perl.w

@brandonpayton

Design/evaluation for kd-ok55 (discovered from kd-lfas). No emitter change — this is a measurement + written assessment + do/defer/won't-fix recommendation for @brandon's sizing/prioritization. Draft: for review/decision, not merge.

Doc: docs/plans/2026-07-02-fork-instrument-size-overhead-eval.md. Reproducible data under test-runs/kd-ok55/.

What was measured

  • Selection ratio (measured, 7 real shipped runtimes): the fork-path reverse-reachability closure instruments 69–87% of ALL functions — bash 80.3%, coreutils 69.0%, vim 78.9%, git 87.1%, ruby 86.8%, php/php-fpm 77.6%. This, not stub inefficiency, is the primary size driver, and it's far above the design's own "well-scoped ⇒ +2–5%" expectation.
  • Clean code-section delta (single-step, real ruby): +166.7% code (perl reference +127%). Essentially all of the overhead (~100%) lands in the Code section; instrumentation more than doubles it (about 2.3× for perl, 2.7× for ruby).
  • Cost model (A/B measured, C analytical): bytes(f) ≈ A + B·callsites + C·locals, A≈110 B fixed, B≈27 B/call-site; the per-scalar-local term C is the largest per-function term by a byte-level read of the emitter but is unmeasured (synthetics have no locals) — hence the --stats recommendation.
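
As a rough illustration of how the cost model above could be applied, here is a minimal Python sketch. `A` and `B` are the approximate measured values quoted in the bullet; `bytes_per_local` stands in for the unmeasured `C` term, so any value passed for it is a placeholder and not a measurement. The function names are hypothetical and are not part of the emitter or the harness under test-runs/kd-ok55/.

```python
# Approximate constants quoted in this PR (measured on synthetics).
FIXED_BYTES = 110      # A: fixed per-function overhead, in bytes
CALLSITE_BYTES = 27    # B: overhead per instrumented call site, in bytes


def estimated_overhead_bytes(callsites: int, scalar_locals: int,
                             bytes_per_local: float) -> float:
    """Estimate bytes(f) ~= A + B*callsites + C*locals.

    bytes_per_local is the C term, which this PR reports as unmeasured;
    the caller must supply an assumed value.
    """
    if callsites < 0 or scalar_locals < 0:
        raise ValueError("counts must be non-negative")
    return (FIXED_BYTES
            + CALLSITE_BYTES * callsites
            + bytes_per_local * scalar_locals)


def growth_factor(percent_delta: float) -> float:
    """Convert a '+N%' code-section delta into a size multiplier."""
    return 1.0 + percent_delta / 100.0


if __name__ == "__main__":
    # A function with 3 call sites and no scalar locals (like the synthetics).
    print(estimated_overhead_bytes(3, 0, 0.0))
    # The reported code-section deltas expressed as multipliers.
    print(growth_factor(127.0), growth_factor(166.7))
```

With `scalar_locals = 0` the estimate depends only on the measured terms, which matches the synthetic case. Any nonzero `bytes_per_local` is an assumption until a `--stats` mode reports real per-function numbers.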

Recommendation

  • DEFER a dedicated size project. The wins are multi-MB on php/ruby, which would help the browser host, but every emitter change is gated by the ≥10k-iteration fork fuzz gate plus the POSIX/libc suites, because a rewind bug silently corrupts forked children.
  • DO now: add a --stats mode (no emitter change) — a prerequisite for adjudicating the per-function size case.
  • If prioritized: frame-ptr CSE (ABI-neutral, ~15–20%) → fork-live-only local spill → tighter call_indirect selection. Not the trampoline first.
  • Won't-fix-for-size: reviving the scaffolded table-driven "trampoline" — it targets the minor dispatch term while adding a REWIND call_indirect + per-function tables.

Key claims were independently and adversarially reviewed; the apportionment was softened to a measured-vs-inferred split accordingly.

🤖 Generated with Claude Code

Measurement + assessment of fork-instrumentation code-size overhead with a
do/defer/won't-fix recommendation. The fork-path reverse-reachability closure
instruments 69-87% of ALL functions across 7 real runtimes (bash, coreutils,
vim, git, ruby, php, php-fpm) and ~doubles the code section (perl +127%,
ruby +167% clean single-step). Per-function cost is dominated (analytically;
unmeasured) by per-scalar-local spill, a hard inline floor under wasm's
static-local-index rule.

Recommendation: DEFER a dedicated size project; DO a safe --stats enabler
(prerequisite for adjudicating the size case); if prioritized, stage
frame-ptr CSE (ABI-neutral ~15-20%) -> fork-live-only local spill ->
selection tightening. Won't-fix the scaffolded table-driven trampoline for
size (targets the minor dispatch term). Design/eval only; no emitter change.

Artifacts + reproducible harness under test-runs/kd-ok55/.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>