Skip to content

fix(keepkey): validate Features version fields in initialize() before semver#37

Merged
BitHighlander merged 2 commits into
masterfrom
fix/keepkey-initialize-validate-version
Apr 8, 2026
Merged

fix(keepkey): validate Features version fields in initialize() before semver#37
BitHighlander merged 2 commits into
masterfrom
fix/keepkey-initialize-validate-version

Conversation

@BitHighlander
Copy link
Copy Markdown
Collaborator

Summary

KeepKeyHDWallet.initialize() constructs the firmware version string from the protobuf Features response and immediately passes it to semver.gte() to populate the coin-support feature flags:

const fwVersion = `v${out.majorVersion}.${out.minorVersion}.${out.patchVersion}`;
this._supportsOsmosis = semver.gte(fwVersion, "v7.7.0");

If any of majorVersion / minorVersion / patchVersion are undefined (wrong-message-type miscast, decode mismatch, device in an unexpected state, etc.), the code happily produces:

vundefined.undefined.undefined

…and semver.gte() then throws the opaque error:

Invalid Version: vundefined.undefined.undefined

This bubbles up through pairRawDevice() with no actionable diagnostic, which has been observed leaking from the WebUSB pair path in KeepKey Vault on Windows as a silent splash hang on first launch — there's no clue in the engine log that the failure is a malformed Features response.

Fix

Add a guard in initialize() that validates the three version fields are numbers before constructing fwVersion. On failure, throw a clear runtime error that names the missing fields:

KeepKey Initialize returned Features without firmware version (major=undefined, minor=undefined, patch=undefined). Device may be in bootloader mode, mid-update, or returned an unexpected message type.

Consumers (KeepKey Vault engine controller, etc.) can now recognize this failure mode and surface a meaningful UI state instead of crashing or hanging.

Tests

New file packages/hdwallet-keepkey/src/keepkey-initialize.test.ts with 4 tests:

  1. Throws clear error when Features has no version fields
  2. Throws when only majorVersion is missing
  3. Does not throw the validation error when all version fields are present
  4. Regression: never produces Invalid Version: vundefined.undefined.undefined

The tests use a hand-rolled mock transport (call, getDeviceID, keyring.addAlias, debugLink) — no USB hardware required.

Verified locally

Step Result
yarn build (tsc --build) clean
New tests against unfixed code 3 fail with "Invalid Version: vundefined.undefined.undefined" (exact match for the original bug string)
New tests against fixed code 4/4 pass
Full hdwallet-keepkey suite 13/13 pass (9 baseline + 4 new)
yarn lint:ts (tsc --noEmit) clean

Scope

Strictly defensive — no behavioral change for any code path where the device returns a properly populated Features. The fix only affects the previously-undiagnosable vundefined.undefined.undefined failure mode.

Test plan

  • Unit tests pass (yarn jest --testPathPattern keepkey-initialize)
  • Existing tests still pass (yarn jest --testPathPattern hdwallet-keepkey)
  • TypeScript typecheck clean (yarn lint:ts)
  • Reviewer: skim the new error message wording for consistency with project tone

… semver

Without this guard, a Features payload missing major/minor/patch
(wrong-message-type miscast, decode mismatch, or device in an
unexpected state) caused initialize() to construct
`vundefined.undefined.undefined` and pass it to semver.gte(), which
threw the opaque error:

    Invalid Version: vundefined.undefined.undefined

This error has been observed leaking from the WebUSB pair path in
KeepKey Vault on Windows with no actionable diagnostic in the engine
log. Replace it with a clear runtime error that names the missing
fields so callers can recognize the failure mode and surface a
useful UI state instead of a silent splash hang.

Includes a regression test (keepkey-initialize.test.ts) that
reproduces the exact original error string against the unfixed code
and verifies the new validation throws before semver is reached.
@vercel
Copy link
Copy Markdown

vercel Bot commented Apr 8, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
hdwallet-sandbox Ready Ready Preview, Comment Apr 8, 2026 10:38pm

Request Review

- Capture rejection once instead of double initialize() call
- Happy-path test now asserts resolve + verifies returned features

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@BitHighlander BitHighlander merged commit c0c4c73 into master Apr 8, 2026
3 of 4 checks passed
BitHighlander pushed a commit to keepkey/keepkey-vault that referenced this pull request Apr 8, 2026
Captures the diagnostic work from a 1.2.14 Windows install session for
the next agent to act on. Three distinct failure modes, all surfacing
to the user as the same "splash never advances" symptom but caused by
unrelated bugs in different layers.

1. `Invalid Version: vundefined.undefined.undefined` on WebUSB pair —
   defensive fix in flight at keepkey/hdwallet#37 (with regression
   tests). Bumping the submodule pointer here is a follow-up after
   that PR merges.

2. Native crash on USB unplug during in-flight WebUSB write — bun
   process disappears with no JS error after libusb's `bad write`.
   Same family as the existing macOS attach-time SIGTRAP guard in
   engine-controller.ts:251-254. Open; suggested approach in the doc
   ranges from a process-supervisor band-aid to moving USB I/O into a
   child process.

3. Splash hangs forever when port 1646 is already bound — the FATAL
   exit message is in the backend log but the BrowserWindow is created
   before the port collision check, leaving an undecorated splash on
   screen. Open; fix is to move the collision probe before window
   creation and/or surface a "Vault is already running" UI state.

Doc includes log evidence for each finding, root-cause file pointers,
reproduction recipes, and an explicit "what is NOT broken" section to
prevent the next agent from re-investigating the verified-intact
download, the 1.2.14 @noble/hashes fix, the protobuf schema, and the
frontend bundle.
BitHighlander pushed a commit to keepkey/keepkey-vault that referenced this pull request Apr 9, 2026
…ation

THE BUG
=======

projects/keepkey-vault/src/bun/index.ts:4246 unconditionally spawned
`bash -c '<heartbeat watchdog script>'` at module load. On Win10, when
the app is launched from Explorer / Start Menu / installer Run-now,
the bun worker inherits an empty PATH from the parent process.
Bun.spawn delegates to libuv, which fails with UV_ENOENT (-4058) when
it can't locate `bash`, and the failure is delivered as an UNCAUGHT
asynchronous exception in the worker thread several seconds after the
spawn call site. The exception kills the entire app right around the
device pair flow, and the user sees a splash that hangs and then
disappears.

The watchdog is POSIX-only — its script uses bash, kill -9, date +%s,
sleep, cat — and could never have functioned on Windows even if `bash`
were on PATH. It exists to defend against an FFI freeze in
kkemu confirm_helper that only happens on macOS/Linux builds.

This was the dominant Win10 1.2.14 first-launch crash. Reproduced
twice across two different rebuilt 1.2.14 binaries (4f8ec1ba… and
2111ad61…), and verified fixed in-place by patching the installed
bundle and re-launching from the desktop icon.

THE FIX
=======

1. startHeartbeatWatchdog() returns early on process.platform === 'win32'
   with a [Vault] Heartbeat watchdog skipped on Windows log line.
2. The Bun.spawn(['bash', ...]) call is wrapped in try/catch as
   defense-in-depth on POSIX hosts where bash could conceivably be
   missing (containers, minimal NixOS, etc).
3. A long comment block above the function explains the platform
   constraint and references this incident, so the next person to
   touch the watchdog doesn't accidentally re-enable it on Windows.

OBSERVABILITY (the only reason this was diagnosable)
====================================================

The previous logger used fs.createWriteStream(LOG_FILE, {flags:'a'})
whose buffered .write() calls never reached disk on a worker-thread
crash. Every failed launch left a vault-backend.log that "ended" 2-7
seconds before the actual death point, and we spent two days chasing
wrong root causes (libusb segfault, semver throw, port collision)
because the actual exception line never made it to disk.

This commit replaces the buffered stream with fs.appendFileSync per
log call. Throughput hit is negligible at our log volume (~10-100
lines/sec at peak boot); the upside is that the log file is now a
faithful record of what executed up to a crash.

It also adds a structured boot environment dump immediately after
[Boot] Log file: …, capturing platform, arch, pid, ppid, cwd, argv,
stdio TTY status, PATH.length, LANG, LC_ALL, and Windows-only env
vars (USERNAME, SESSIONNAME, APPDATA, LOCALAPPDATA). The PATH.length
field is what surfaced this bug — Explorer launches show
PATH.length=0, terminal launches show PATH.length=882. Without that
single field, the only evidence was "splash hangs" which is
consistent with twenty different root causes.

Finally, engine-controller.ts gets boundary log lines around every
JS↔native transition in start() and fetchFirmwareManifest() —
usb.on(attach), usb.on(detach), usb.getDeviceList(), mergeManifests,
applyChannel, syncState — each wrapped in try/catch with an explicit
[Engine] FATAL: log so a future libusb segfault leaves a clear
breadcrumb instead of a silent process exit.

WHAT THIS DOES NOT FIX
======================

The three findings from docs/HANDOFF-1.2.14-WINDOWS-PAIR.md remain:

- Finding 1 (Invalid Version: vundefined.undefined.undefined) is
  addressed by keepkey/hdwallet#37 which still needs review/merge,
  followed by a submodule pointer bump here.
- Finding 2 (native crash on USB unplug during pairRawDevice) is
  separate from this PR's bug. With the sync logger landing, the
  next reproduction should leave the actual death cause in the log.
- Finding 3 (port-1646 collision splash hang) is already fixed in
  the official 1.2.14 rebuild from 2026-04-09 — verified working.

See docs/HANDOFF-1.2.14-WIN10-WATCHDOG-CRASH.md for the full
diagnosis story, including the verification recipe.
BitHighlander added a commit to keepkey/keepkey-vault that referenced this pull request Apr 9, 2026
Bump version to 1.2.15 and pin hdwallet submodule to master (includes
keepkey/hdwallet#37 — Features version validation).

Changes since v1.2.14 pre-release:
- fix(win10): skip POSIX heartbeat watchdog on Windows (#107)
- fix: sync file logger — crash logs now survive native exceptions (#107)
- fix: boot environment dump for launch-context diagnostics (#107)
- fix: JS↔native boundary logging in engine-controller (#107)
- fix: port 1646 collision check before window creation (#106)
- fix: USB detach race guard on WebUSB pair (#106)
- fix: hdwallet Features version validation (hdwallet#37)
- docs: handoff for 1.2.14 Windows pair failures (#105)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant