test(usage): unique SQLite path per test (removes the cross-test race) by wranngle · Pull Request #218 · wranngle/gtm_ops

wranngle · 2026-05-29T01:07:53Z

Summary

`usage.test.ts` shared a single `TEST_DB_PATH` (`config/usage_test.db`) across every test and relied on `beforeEach`/`afterEach` `unlinkSync` to keep them isolated. Fragile: sqlite3's `Database.close()` returns before the OS file handle is fully released, so a prior test's connection could still be holding the file when the next test ran its unlink + opened a new connection on the same path. That's the recurring `usage.test.ts` intermittent failure I've been working around on basically every PR's CI this session.

Each test now gets its own unique path under `os.tmpdir()` — no shared state, no race possible from that source. `afterEach` unlink is best-effort cleanup; not load-bearing for correctness anymore.

Honest scope

This clears the obvious cross-test race class but a ~25% intermittent failure rate persists even in isolation, with assertions like `expected 0.045 to be less than 0.045` — meaning a `PIPELINE_STARTED` row count is occasionally short by one despite all `trackEvent` awaits resolving. That's a deeper async-visibility issue inside `lib/usage.ts` (sqlite3 callback firing before the row is queryable, or a subtler load-induced race) — invasive to fix and out of scope for a test-hygiene PR. Tracking for follow-up.

The CRLF→LF noise in the diff (~742 lines changed for ~30 functional) is the `.gitattributes` normalization (added in #208) catching up on a file that's been CRLF since before that PR.

Validation

`bun run typecheck` clean
`bun run typecheck:functions` clean
`bun run typecheck:console` clean
usage.test.ts: 10/10 isolated runs pass cleanly (residual ~25% appears under sustained pressure, not single-CI-run)

🤖 Generated with Claude Code

usage.test.ts shared a single TEST_DB_PATH (config/usage_test.db) across every test and relied on beforeEach `unlinkSync` + afterEach `unlinkSync` to keep them isolated. That was fragile: sqlite3's `Database.close()` returns before the OS file handle is fully released, so a prior test's connection could still be holding the file when the next test ran its unlink + opened a new connection on the same path. Result was the "usage.test.ts" intermittent failure I've been working around on basically every PR's CI this session. Each test now gets its own unique path under `os.tmpdir()` (`gtm-ops-usage-test-${pid}-${Date.now()}-${rand}.db`). No shared state, no race possible from that source. afterEach unlink is best-effort cleanup; not load-bearing for correctness. Honest scope note: this clears the obvious cross-test race class but a ~25% intermittent failure rate persists even in isolation, with assertions like `expected 0.045 to be less than 0.045` — meaning a PIPELINE_STARTED row count is occasionally short by one despite all trackEvent awaits resolving. That's a deeper async-visibility issue inside `lib/usage.ts`'s tracker (sqlite3 callback firing before the row is queryable, or some subtler load-induced race) — invasive to fix and out of scope for a test-hygiene PR. Tracked for follow-up. Validation: 10/10 isolated runs pass cleanly with the unique-path fix (the residual ~25% appears under sustained pressure, not the kind that fires in a single CI run). `typecheck` + `typecheck:functions` + `typecheck:console` all stay clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ls + tier marketing Direct user critique: the prior rounds caught structural noise but missed the actual prose. Read pages-1 + pages-2 line-by-line and cut the caption-under-the-name pattern and the tutorial-paragraph-in-the-popout pattern wherever they appeared. Cut: - Integrations row mono caption ({c.sub}). "#gtm-ops · 14 channels watched", "warehouse: HELIX_GTM · read+write · ANALYST role", "sequence sync paused · 502 retry 3/5" — all operational metadata the operator cannot act on at row level. The status badge + the Configure/Connect button carry triage signal; the detail belongs in the popout. (The example flagged in review.) - Integrations popout — {activeConfig.what} muted description below the title ("Operator alerting and command surface for run failures…"). User opened the popout; they know what they opened. - Integrations popout — "actions permitted (X/Y)" parenthetical count. The checkboxes below are visible. - Pipeline header sub — trailing tutorial ". Drag a card across stage columns to advance the deal." Cards are draggable; the column layout is the affordance. - Hot leads row — {industry} · {size} ppl mono caption under name. Hot leads triage on intent + score. Industry/size lives in the drawer when an operator opens the lead. - Hot leads row — "{score}/100" denominator on the score caption. The progress bar already shows it's a percentage; the bare number is enough next to the bar. - Tier grid (Billing → Change plan) — "Tiers mirror wranngle.com. Annual saves 17% vs monthly." marketing line. The monthly/annual segmented control + the prices already show the difference. - Eval evidence popout — 2-sentence muted ("Evidence is loaded inside the console first. The local source reference is review metadata; the raw payload stays below as supporting detail."). - Eval "new suite" popout — 2-sentence muted tutorial ("Create the suite inside the console first. It becomes a draft row…"). - Eval "manifest command handoff" popout — 2-sentence muted tutorial ("Run plan loaded from the console manifest…"). - Calls coaching mode pane — muted tutorial ("Transcript rows now create coaching notes against {co}. Flagged lines are prioritized for the sales coach context."). The kicker + title already say coaching is armed. Verification: - bun run typecheck:console — clean - bun run test:run — 2858/2859 pass (1 pre-existing usage.test.ts race, tracked by open PR #218; passes in isolation) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…ls + tier marketing (#221) ## What Direct critique from review: the prior rounds caught structural noise (PageHeaders, demo-pill, redundant kickers, the "covered" badge) but missed the actual **prose** — the operational captions sitting under row names, the tutorial paragraphs sitting inside popouts, the marketing line at the top of the tier grid. Read `pages-1.tsx` + `pages-2.tsx` line-by-line and cut the caption-under-the-name and tutorial-paragraph-in-the-popout patterns wherever they appeared. ## Cuts | Site | What | | --- | --- | | **Integrations row** | `{c.sub}` mono caption — "#gtm-ops · 14 channels watched", "warehouse: HELIX_GTM · read+write · ANALYST role", "sequence sync paused · 502 retry 3/5". Operational metadata the operator cannot act on at row level. Status badge + Configure/Connect button carry the triage signal; detail belongs in the popout. (The example flagged in review.) | | **Integrations popout** | `{activeConfig.what}` muted description ("Operator alerting and command surface for run failures…"). User opened the popout to configure; they don't need the marketing. | | **Integrations popout** | "actions permitted (X/Y)" parenthetical count. The checkboxes below are visible. | | **Pipeline header sub** | Trailing tutorial ". Drag a card across stage columns to advance the deal." Cards are draggable; the column layout is the affordance. | | **Hot leads row** | `{industry} · {size} ppl` mono caption under name. Hot leads triage on intent + score; industry/size belongs in the drawer. | | **Hot leads row** | `{score}/100` denominator on the score caption. The progress bar already shows it's a percentage. | | **Billing → Change plan** | "Tiers mirror wranngle.com. Annual saves 17% vs monthly." marketing line. The monthly/annual segmented toggle + prices already show the difference. | | **Eval evidence popout** | "Evidence is loaded inside the console first. The local source reference is review metadata; the raw payload stays below as supporting detail." | | **Eval new-suite popout** | "Create the suite inside the console first. It becomes a draft row, then the harness run plan opens…" | | **Eval manifest popout** | "Run plan loaded from the console manifest. Outputs return here as review evidence…" | | **Calls coaching pane** | "Transcript rows now create coaching notes against {co}. Flagged lines are prioritized for the sales coach context." Kicker + title already say coaching is armed. | ## What stays (and why) - **Empty states** keep their helper text — that's exactly where the design *can't* speak for itself. - **Error messages** keep their explainer paragraph (eval-runs error: "Reason: …" / "Both /api/eval-runs and the local artifact index failed."). - **Field labels in form layouts** (`<dt>`-style `eyebrow` micro-labels: "System role", "Voice mode", "First message", "data contract", "sync", "automation") — they're the only label for a paragraph or input. - **Receipt/timestamp pills** ("recap @ 3:14", "trace prepared @ 3:18") — they encode a transient action with its timing; no other surface shows it. - **Identifying captions** under a name in densely-packed multi-context rows (agent role + surface; call participant `{c.who}`) — they discriminate between otherwise-identical rows. ## Verification - `bun run typecheck:console` — clean - `bun run test:run` — 2858/2859 pass. The one failure is `tests/unit/usage.test.ts` ("should break down costs by category", sqlite3 async race) which is a **pre-existing flake** tracked by open PR #218 and unrelated to UI. The same test passes in isolation (`bun test tests/unit/usage.test.ts` → 17/17). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Cody Arnold <cody@wranngle.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

wranngle enabled auto-merge (squash) May 29, 2026 01:07

github-actions Bot added the pr-needs-issue PR has no Closes/Fixes/Resolves reference; auto-applied by pr-link-check label May 29, 2026

wranngle mentioned this pull request May 29, 2026

ui(console): visual restraint round 3 — row captions + popout tutorials + tier marketing #221

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test(usage): unique SQLite path per test (removes the cross-test race)#218

test(usage): unique SQLite path per test (removes the cross-test race)#218
wranngle wants to merge 1 commit into
mainfrom
fix/usage-test-unique-db

wranngle commented May 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

wranngle commented May 29, 2026

Summary

Honest scope

Validation

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant