Skip to content

test(usage): unique SQLite path per test (removes the cross-test race)#218

Open
wranngle wants to merge 1 commit into
mainfrom
fix/usage-test-unique-db
Open

test(usage): unique SQLite path per test (removes the cross-test race)#218
wranngle wants to merge 1 commit into
mainfrom
fix/usage-test-unique-db

Conversation

@wranngle

Copy link
Copy Markdown
Owner

Summary

`usage.test.ts` shared a single `TEST_DB_PATH` (`config/usage_test.db`) across every test and relied on `beforeEach`/`afterEach` `unlinkSync` to keep them isolated. Fragile: sqlite3's `Database.close()` returns before the OS file handle is fully released, so a prior test's connection could still be holding the file when the next test ran its unlink + opened a new connection on the same path. That's the recurring `usage.test.ts` intermittent failure I've been working around on basically every PR's CI this session.

Each test now gets its own unique path under `os.tmpdir()` — no shared state, no race possible from that source. `afterEach` unlink is best-effort cleanup; not load-bearing for correctness anymore.

Honest scope

This clears the obvious cross-test race class but a ~25% intermittent failure rate persists even in isolation, with assertions like `expected 0.045 to be less than 0.045` — meaning a `PIPELINE_STARTED` row count is occasionally short by one despite all `trackEvent` awaits resolving. That's a deeper async-visibility issue inside `lib/usage.ts` (sqlite3 callback firing before the row is queryable, or a subtler load-induced race) — invasive to fix and out of scope for a test-hygiene PR. Tracking for follow-up.

The CRLF→LF noise in the diff (~742 lines changed for ~30 functional) is the `.gitattributes` normalization (added in #208) catching up on a file that's been CRLF since before that PR.

Validation

  • `bun run typecheck` clean
  • `bun run typecheck:functions` clean
  • `bun run typecheck:console` clean
  • usage.test.ts: 10/10 isolated runs pass cleanly (residual ~25% appears under sustained pressure, not single-CI-run)

🤖 Generated with Claude Code

usage.test.ts shared a single TEST_DB_PATH (config/usage_test.db) across
every test and relied on beforeEach `unlinkSync` + afterEach `unlinkSync`
to keep them isolated. That was fragile: sqlite3's `Database.close()`
returns before the OS file handle is fully released, so a prior test's
connection could still be holding the file when the next test ran its
unlink + opened a new connection on the same path. Result was the
"usage.test.ts" intermittent failure I've been working around on
basically every PR's CI this session.

Each test now gets its own unique path under `os.tmpdir()`
(`gtm-ops-usage-test-${pid}-${Date.now()}-${rand}.db`). No shared state,
no race possible from that source. afterEach unlink is best-effort
cleanup; not load-bearing for correctness.

Honest scope note: this clears the obvious cross-test race class but a
~25% intermittent failure rate persists even in isolation, with
assertions like `expected 0.045 to be less than 0.045` — meaning a
PIPELINE_STARTED row count is occasionally short by one despite all
trackEvent awaits resolving. That's a deeper async-visibility issue
inside `lib/usage.ts`'s tracker (sqlite3 callback firing before the row
is queryable, or some subtler load-induced race) — invasive to fix and
out of scope for a test-hygiene PR. Tracked for follow-up.

Validation: 10/10 isolated runs pass cleanly with the unique-path fix
(the residual ~25% appears under sustained pressure, not the kind that
fires in a single CI run). `typecheck` + `typecheck:functions` +
`typecheck:console` all stay clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@wranngle wranngle enabled auto-merge (squash) May 29, 2026 01:07
@github-actions github-actions Bot added the pr-needs-issue PR has no Closes/Fixes/Resolves reference; auto-applied by pr-link-check label May 29, 2026
wranngle added a commit that referenced this pull request May 29, 2026
…ls + tier marketing

Direct user critique: the prior rounds caught structural noise but missed
the actual prose. Read pages-1 + pages-2 line-by-line and cut the
caption-under-the-name pattern and the tutorial-paragraph-in-the-popout
pattern wherever they appeared.

Cut:

- Integrations row mono caption ({c.sub}). "#gtm-ops · 14 channels
  watched", "warehouse: HELIX_GTM · read+write · ANALYST role",
  "sequence sync paused · 502 retry 3/5" — all operational metadata
  the operator cannot act on at row level. The status badge + the
  Configure/Connect button carry triage signal; the detail belongs in
  the popout. (The example flagged in review.)
- Integrations popout — {activeConfig.what} muted description below
  the title ("Operator alerting and command surface for run
  failures…"). User opened the popout; they know what they opened.
- Integrations popout — "actions permitted (X/Y)" parenthetical count.
  The checkboxes below are visible.
- Pipeline header sub — trailing tutorial ". Drag a card across stage
  columns to advance the deal." Cards are draggable; the column
  layout is the affordance.
- Hot leads row — {industry} · {size} ppl mono caption under name.
  Hot leads triage on intent + score. Industry/size lives in the
  drawer when an operator opens the lead.
- Hot leads row — "{score}/100" denominator on the score caption. The
  progress bar already shows it's a percentage; the bare number is
  enough next to the bar.
- Tier grid (Billing → Change plan) — "Tiers mirror wranngle.com.
  Annual saves 17% vs monthly." marketing line. The monthly/annual
  segmented control + the prices already show the difference.
- Eval evidence popout — 2-sentence muted ("Evidence is loaded inside
  the console first. The local source reference is review metadata;
  the raw payload stays below as supporting detail.").
- Eval "new suite" popout — 2-sentence muted tutorial ("Create the
  suite inside the console first. It becomes a draft row…").
- Eval "manifest command handoff" popout — 2-sentence muted tutorial
  ("Run plan loaded from the console manifest…").
- Calls coaching mode pane — muted tutorial ("Transcript rows now
  create coaching notes against {co}. Flagged lines are prioritized
  for the sales coach context."). The kicker + title already say
  coaching is armed.

Verification:
- bun run typecheck:console — clean
- bun run test:run — 2858/2859 pass (1 pre-existing usage.test.ts race,
  tracked by open PR #218; passes in isolation)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
wranngle added a commit that referenced this pull request May 29, 2026
…ls + tier marketing (#221)

## What

Direct critique from review: the prior rounds caught structural noise
(PageHeaders, demo-pill, redundant kickers, the "covered" badge) but
missed the actual **prose** — the operational captions sitting under row
names, the tutorial paragraphs sitting inside popouts, the marketing
line at the top of the tier grid.

Read `pages-1.tsx` + `pages-2.tsx` line-by-line and cut the
caption-under-the-name and tutorial-paragraph-in-the-popout patterns
wherever they appeared.

## Cuts

| Site | What |
| --- | --- |
| **Integrations row** | `{c.sub}` mono caption — "#gtm-ops · 14
channels watched", "warehouse: HELIX_GTM · read+write · ANALYST role",
"sequence sync paused · 502 retry 3/5". Operational metadata the
operator cannot act on at row level. Status badge + Configure/Connect
button carry the triage signal; detail belongs in the popout. (The
example flagged in review.) |
| **Integrations popout** | `{activeConfig.what}` muted description
("Operator alerting and command surface for run failures…"). User opened
the popout to configure; they don't need the marketing. |
| **Integrations popout** | "actions permitted (X/Y)" parenthetical
count. The checkboxes below are visible. |
| **Pipeline header sub** | Trailing tutorial ". Drag a card across
stage columns to advance the deal." Cards are draggable; the column
layout is the affordance. |
| **Hot leads row** | `{industry} · {size} ppl` mono caption under name.
Hot leads triage on intent + score; industry/size belongs in the drawer.
|
| **Hot leads row** | `{score}/100` denominator on the score caption.
The progress bar already shows it's a percentage. |
| **Billing → Change plan** | "Tiers mirror wranngle.com. Annual saves
17% vs monthly." marketing line. The monthly/annual segmented toggle +
prices already show the difference. |
| **Eval evidence popout** | "Evidence is loaded inside the console
first. The local source reference is review metadata; the raw payload
stays below as supporting detail." |
| **Eval new-suite popout** | "Create the suite inside the console
first. It becomes a draft row, then the harness run plan opens…" |
| **Eval manifest popout** | "Run plan loaded from the console manifest.
Outputs return here as review evidence…" |
| **Calls coaching pane** | "Transcript rows now create coaching notes
against {co}. Flagged lines are prioritized for the sales coach
context." Kicker + title already say coaching is armed. |

## What stays (and why)

- **Empty states** keep their helper text — that's exactly where the
design *can't* speak for itself.
- **Error messages** keep their explainer paragraph (eval-runs error:
"Reason: …" / "Both /api/eval-runs and the local artifact index
failed.").
- **Field labels in form layouts** (`<dt>`-style `eyebrow` micro-labels:
"System role", "Voice mode", "First message", "data contract", "sync",
"automation") — they're the only label for a paragraph or input.
- **Receipt/timestamp pills** ("recap @ 3:14", "trace prepared @ 3:18")
— they encode a transient action with its timing; no other surface shows
it.
- **Identifying captions** under a name in densely-packed multi-context
rows (agent role + surface; call participant `{c.who}`) — they
discriminate between otherwise-identical rows.

## Verification

- `bun run typecheck:console` — clean
- `bun run test:run` — 2858/2859 pass. The one failure is
`tests/unit/usage.test.ts` ("should break down costs by category",
sqlite3 async race) which is a **pre-existing flake** tracked by open PR
#218 and unrelated to UI. The same test passes in isolation (`bun test
tests/unit/usage.test.ts` → 17/17).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Cody Arnold <cody@wranngle.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-needs-issue PR has no Closes/Fixes/Resolves reference; auto-applied by pr-link-check

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant