This document defines the path to bring NullHub's test discipline closer to NullClaw's while keeping each improvement shippable in small, isolated pull requests.
The aim is not a single large testing rewrite. The aim is to improve confidence incrementally, with each PR standing on its own wherever possible.
- make the existing backend test suite a reliable daily gate
- expand coverage into the highest-risk backend areas
- expand frontend unit coverage beyond the current targeted UI helper tests
- replace shell-only smoke reliance with structured integration coverage
- keep browser E2E small and focused
- adopt NullClaw-style expectations: every behavior change gets tests, every bug fix gets a regression test
As of the current main branch:
- NullHub already has substantial Zig unit-test coverage in parts of the backend.
- Coverage is concentrated heavily in API and routing code.
- The project has a shell smoke script at tests/test_e2e.sh.
- The project has a targeted Mission Control UI helper test; broader component and route-level frontend coverage is still light.
- CI currently runs backend tests, the shell smoke test on Linux, and ReleaseSmall binary builds.
This means the main gap is not "no tests". The gap is uneven coverage and missing layers.
NullHub should follow the same core discipline used by NullClaw.
- Every code change must be accompanied by tests.
- Every bug fix must include a regression test.
- If a path is impractical to unit test, document why.
- Keep tests as close as possible to the behavior they validate.
- Prefer the smallest test that proves the contract.
- Add test helpers only when they unlock repeated future coverage.
- Keep fast tests fast; separate unit, integration, smoke, and browser E2E concerns.
The snapshot below is based on the current src/ tree and the committed test distribution.
| Area | Current assessment | Evidence in tree | Highest-value next work |
|---|---|---|---|
| API routing and instance endpoints | Strong | src/api/instances.zig, src/server.zig, src/api/* contain the densest test coverage | expand cross-module integration coverage instead of adding more narrow route parsing tests |
| Installer | Medium | src/installer/orchestrator.zig, registry.zig, downloader.zig, ui_modules.zig, builder.zig | add rollback, partial-failure cleanup, and fixture-driven install/update scenarios |
| Supervisor and process lifecycle | Medium | src/supervisor/manager.zig, process.zig, health.zig, runtime_state.zig | add restart/backoff, boot reconciliation, and deterministic lifecycle integration tests |
| Config, state, and paths | Medium | src/core/state.zig, src/api/config.zig, src/core/paths.zig | add tests around persisted-state restoration and migration-sensitive behavior |
| Auth and access control | Light | src/auth.zig, src/access.zig | add unauthorized origin, token failure, and sensitive-route boundary tests |
| Service install/uninstall/status | Light | src/service.zig | add stronger platform-specific generation and failure-path tests |
| Product proxies | Light | src/api/nullboiler.zig, src/api/nulltickets.zig, src/api/nullwatch.zig | add upstream error mapping and token/header forwarding tests |
| Discovery, mDNS, and compat layers | Light | src/discovery.zig, src/mdns.zig, src/compat/* | add degraded-mode and missing-tool fallback coverage |
| Frontend UI logic | Light | ui/src/lib/missionControl/replayAutomation.test.mjs covers the replay automation helper | add broader component and route-level coverage |
| Structured backend integration tests | Light | shell smoke only in tests/test_e2e.sh | add a real HTTP/integration harness with fixtures |
| Browser end-to-end | Missing | no Playwright or equivalent suite | add a very small critical-flow suite after UI unit tests land |
The current backend suite is broad in file count but uneven in depth.
Files that sit near the high end of the current distribution include:
- src/api/instances.zig
- src/server.zig
- src/api/providers.zig
- src/core/state.zig
- src/cli.zig
- src/api/wizard.zig
- src/api/logs.zig
- src/installer/orchestrator.zig
- src/supervisor/manager.zig
- src/api/config.zig
Refresh this snapshot with:
rg -n --glob '*.zig' '^test\s+"' src | awk -F: '{count[$1]++} END {for (f in count) print count[f], f}' | sort -nr

NullHub should converge on four layers.
Use for:
- parsing and normalization
- route matching
- config and state transforms
- installer decision logic
- supervisor state transitions
- auth and access rules
Primary local command:
zig build test -Dembed-ui=false -Dbuild-ui=false --summary all

This backend-only test entrypoint does not require prebuilt UI assets.
Use for:
- HTTP route behavior across modules
- boot and runtime lifecycle flows
- managed-instance interactions
- product proxy behavior with fake upstreams
- installer and update scenarios using fixtures
These should not require a browser.
Use for:
- API client helpers
- stores and route transforms
- form validation and state behavior
- NullBoiler helper and key UI components
Recommended tooling:
- vitest
- @testing-library/svelte
Use for:
- route loading and hydration sanity
- critical user flows
- embedded asset/runtime integration
Recommended tooling:
- Playwright
Keep this layer intentionally small.
Every testing PR should follow this pattern unless it is documentation-only.
- Pick one behavior, contract, or regression.
- Add a failing test that expresses the expected behavior.
- Make the smallest code change that makes the test pass.
- Run the smallest relevant validation first.
- Run the broader project gate before opening the PR.
- Document anything skipped.
For bug fixes, prefer explicit regression naming or a short regression comment.
The sequence below is designed for clean, isolated PRs.
Purpose:
- document the test contract
- align contributor expectations with NullClaw's model
Status:
- covered by this document
Dependencies:
- none
Purpose:
- make the shell smoke test fail on real server crashes
- keep smoke runs isolated from developer-local state
Landed scope:
test(smoke): harden e2e server diagnostics
Status:
- already landed on main in tests/test_e2e.sh; do not open a duplicate smoke-hardening PR unless new smoke gaps are identified
Dependencies:
- none
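The two purposes of this phase (failing on a real server crash and isolating runs from developer-local state) can be illustrated with a few helpers. This is a minimal sketch and not the contents of tests/test_e2e.sh: the function names are hypothetical, and how the server is pointed at the throwaway directory (an environment variable or a flag) is an assumption left to the caller.

```shell
#!/usr/bin/env bash
# Illustrative helpers only; names are hypothetical, not taken from tests/test_e2e.sh.
set -euo pipefail

# Create a throwaway state directory so a smoke run never touches
# developer-local state. The caller is responsible for removing it.
make_isolated_home() {
  mktemp -d "${TMPDIR:-/tmp}/smoke-home.XXXXXX"
}

# Return 0 while the process is still running, non-zero once it has exited.
server_alive() {
  kill -0 "$1" 2>/dev/null
}

# Fail loudly if the server process has died, printing the tail of its log
# so the crash is diagnosable from the smoke output alone.
require_server_alive() {
  local pid="$1" log="$2"
  if ! server_alive "$pid"; then
    echo "server (pid $pid) exited unexpectedly; last log lines:" >&2
    tail -n 20 "$log" >&2 || true
    return 1
  fi
}

# Typical use inside a smoke script (the server command is an assumption):
#   home="$(make_isolated_home)"; trap 'rm -rf "$home"' EXIT
#   <start the server against "$home"> >"$home/server.log" 2>&1 &
#   pid=$!
#   require_server_alive "$pid" "$home/server.log"   # call between smoke steps
```

Calling the liveness check between steps means a crashed server fails the run at the point of the crash instead of surfacing later as an unrelated request error.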
Purpose:
- make current strengths and weaknesses explicit
- give later test PRs a scoped target list
Status:
- covered by this document
Dependencies:
- none
Purpose:
- make backend tests the undisputed daily gate
- reduce confusion around UI asset coupling during test runs
Suggested PR:
build(test): make backend test entrypoint deterministic and documented
Dependencies:
- none
Purpose:
- make installer, supervisor, and product proxy tests cheaper to write
Suggested PR:
test(fixtures): add reusable backend test helpers for state and upstream fakes
Dependencies:
- Phase 3 preferred
Target order:
- supervisor and process lifecycle
- installer and updates
- auth and access control
- product proxy behavior
- service generation and status behavior
- discovery and degraded-mode fallbacks
Example PRs:
- test(supervisor): cover restart threshold and crash recovery transitions
- test(installer): cover rollback and duplicate-instance failure paths
- test(auth): cover unauthorized origin and bearer-token failure paths
- test(product-proxies): cover upstream error mapping and token forwarding
- test(service): cover launchd/systemd generation and failure paths
Dependencies:
- Phase 4 recommended for several of these areas
Purpose:
- stop relying on a shell script as the only assembled-behavior check
Suggested PRs:
- test(integration): add structured HTTP smoke harness
- test(integration): cover instance lifecycle and config mutation flows
- test(integration): cover product proxy scenarios
Dependencies:
- Phase 4 strongly recommended
Purpose:
- expand the UI logic test layer
Suggested PRs:
- test(ui): add component-level Svelte test coverage
- test(ui): cover API client and config-form helpers
- test(ui): cover NullBoiler helpers and key components
Dependencies:
- none
Purpose:
- catch browser-only regressions without growing a large flaky suite
Suggested PRs:
- test(e2e): add Playwright harness and dashboard smoke flow
- test(e2e): cover instances and settings journeys
- test(e2e): cover wizard happy path
Dependencies:
- Phase 7 recommended
Purpose:
- make testing discipline the default workflow rather than tribal knowledge
Suggested PRs:
- ci(test): split backend, smoke, and release jobs
- hooks(test): add pre-push backend test enforcement
- ci(ui): add frontend unit and browser E2E jobs
Dependencies:
- depends on the corresponding earlier phases for any enforced suites
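As a rough illustration of pre-push backend test enforcement, a Git pre-push hook could wrap the backend gate as sketched below. The hook layout, the `run_gate` function, and the `GATE_CMD` override are assumptions made for this example, not the project's actual hook.

```shell
#!/usr/bin/env bash
# Sketch of a .git/hooks/pre-push script; layout and names are hypothetical.
set -euo pipefail

# GATE_CMD is a hypothetical override so the hook can be exercised
# without a Zig toolchain; by default it runs the backend-only gate.
GATE_CMD="${GATE_CMD:-zig build test -Dembed-ui=false -Dbuild-ui=false --summary all}"

# Run the gate and report failure on stderr. Git aborts the push when
# the pre-push hook exits with a non-zero status.
run_gate() {
  echo "pre-push: running backend gate: $GATE_CMD" >&2
  if ! bash -c "$GATE_CMD"; then
    echo "pre-push: backend tests failed; push aborted" >&2
    return 1
  fi
}

# In the real hook file, finish with:
#   run_gate
```

Keeping the gate command in one variable makes it easier to keep the hook and the CI job on the same entrypoint.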
Purpose:
- make gaps visible without optimizing for vanity percentages too early
Suggested PR:
ci(coverage): publish test suite summary and UI coverage artifacts
Dependencies:
- frontend harness in place first
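One way to make gaps visible without chasing percentages is a per-file test-block summary that also lists files with no tests. The sketch below is a grep-based variant of the snapshot command in this document; the function names are hypothetical, and it counts only blocks matching the same `test "..."` pattern.

```shell
#!/usr/bin/env bash
# Sketch of a test suite summary; function names are hypothetical.
set -euo pipefail

# Print "<count> <file>" for every .zig file under a directory,
# including files whose count is zero.
test_block_counts() {
  local root="$1" file
  find "$root" -type f -name '*.zig' | sort | while IFS= read -r file; do
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    printf '%s %s\n' "$(grep -cE '^test[[:space:]]+"' "$file" || true)" "$file"
  done
}

# Print only the files that contain no matching test blocks.
files_without_tests() {
  test_block_counts "$1" | awk '$1 == 0 { sub(/^[0-9]+ /, ""); print }'
}

# Example:
#   test_block_counts src | sort -nr
#   files_without_tests src
```

A zero count is a prompt to look, not a verdict: some files may be impractical to unit test, in which case the reason should be documented as described earlier.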
Docs-only changes:
git diff --check
Backend code changes:
zig build test -Dembed-ui=false -Dbuild-ui=false --summary all
Smoke or lifecycle changes:
zig build test -Dembed-ui=false -Dbuild-ui=false --summary all
bash tests/test_e2e.sh
Frontend logic changes:
npm --prefix ui test -- --run
zig build test -Dembed-ui=false -Dbuild-ui=false --summary all

If any validation is skipped, the PR description should say exactly what was skipped and why.
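The validation matrix above could be turned into a small helper that reads changed paths and prints the commands to run. This is a sketch under stated assumptions: the `checks_for_paths` function and the path-to-category rules (for example, treating `*.md` as docs-only and `src/supervisor/*` as lifecycle code) are illustrative guesses, not project policy.

```shell
#!/usr/bin/env bash
# Sketch: map changed paths to validation commands.
# The path-to-category rules below are assumptions, not project policy.
set -euo pipefail

BACKEND_GATE='zig build test -Dembed-ui=false -Dbuild-ui=false --summary all'

# Read one changed path per line on stdin and print each required
# validation command once.
checks_for_paths() {
  local docs=0 backend=0 smoke=0 ui=0 path
  while IFS= read -r path; do
    case "$path" in
      '') ;;                                   # ignore blank lines
      *.md) docs=1 ;;                          # docs-only change
      ui/*) ui=1; backend=1 ;;                 # frontend logic change
      tests/test_e2e.sh|src/supervisor/*) smoke=1; backend=1 ;;  # smoke or lifecycle
      *) backend=1 ;;                          # any other code change
    esac
  done
  if [ "$docs" -eq 1 ]; then echo 'git diff --check'; fi
  if [ "$ui" -eq 1 ]; then echo 'npm --prefix ui test -- --run'; fi
  if [ "$backend" -eq 1 ]; then echo "$BACKEND_GATE"; fi
  if [ "$smoke" -eq 1 ]; then echo 'bash tests/test_e2e.sh'; fi
}

# Example:
#   git diff --name-only main...HEAD | checks_for_paths
```

The helper only prints commands, so a contributor can review the list and note in the PR description anything they chose to skip.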
NullHub should be considered aligned with NullClaw's testing model when all of the following are true:
- contributor docs require tests for every code change
- backend tests are reliable and treated as the primary local gate
- high-risk backend subsystems have direct failure-mode coverage
- structured backend integration tests exist beyond shell-only smoke
- frontend unit tests run locally and in CI
- a minimal browser E2E suite covers critical user journeys
- CI and hooks reinforce the workflow