Skip to content

feat: markdown result preview + failure-body fallback + uncapped success output#85

Open
cad0p wants to merge 47 commits into
tintinweb:masterfrom
cad0p:markdown-result-preview
Open

feat: markdown result preview + failure-body fallback + uncapped success output#85
cad0p wants to merge 47 commits into
tintinweb:masterfrom
cad0p:markdown-result-preview

Conversation

@cad0p

@cad0p cad0p commented May 29, 2026

Copy link
Copy Markdown

Brings markdown rendering + failure-body fallback to subagent completion notifications. Full design doc and 4-phase release-grade methodology audit trail (4 implementation review rounds + E2E with live pi + 4 bloat-reduction review rounds) in cad0p#1 — 47 commits squashed to one for review here.

What changes

  • subagent-notification body now renders via new Markdown(body, 2, 0, getMarkdownTheme()) (using @earendil-works/pi-tui + @earendil-works/pi-coding-agent exports already in use). Headings, bullets, tables, code fences, syntax highlighting all theme-styled.
  • resultPreviewMode: "plain" | "markdown" (default "markdown"). Plain mode is byte-equivalent to the current renderer; pinned by test/renderer-regression.test.ts snapshots.
  • resultPreviewExpanded: boolean (default true). When true, full body inline; ctrl+O is a no-op. When false, collapsed to first 10 lines + … (N more lines, ctrl+O to expand) hint (matches read / write tool precedent at core/tools/read.js:112, write.js:109).
  • failurePreviewMaxChars: number (default 65536). Cap applies only on error / stopped status; success / aborted / steered are uncapped. Removes the 500-char (individual) / 300-char (group) caps on success-path output, which forced get_subagent_result round-trips for any non-trivial result.
  • Failure-body fallback record.result ?? record.error ?? "" in both formatTaskNotification and buildNotificationDetails. Failed agents previously rendered "No output." because record.result is undefined on error / stopped paths (only set on success in agent-manager.ts:226, :358); the actual error text was visible only in the XML <status> field.

Backward compatibility

Default behavior changes (markdown rendering + uncapped success output) but the upstream behavior is fully reachable via resultPreviewMode: "plain" + resultPreviewExpanded: false. Snapshot test in test/renderer-regression.test.ts locks byte-equivalence (modulo timestamps).

Tests

  • 500 tests pass (npm test)
  • Typecheck/lint/build clean
  • stripInternal: true added to tsconfig.json so @internal testability seams (e.g. subagentNotificationRenderer, buildResultPreview) don't leak into dist/index.d.ts

Settings UI

Three new entries in /agents → Settings. Persist to existing .pi/subagents.json via the same global/project precedence as the existing settings.

Happy to split or scope this differently if you'd prefer — the design has no hard coupling between the markdown rendering, the cap removal, and the failure-body fallback.

@cad0p cad0p force-pushed the markdown-result-preview branch from a550ac2 to 8efe92d Compare June 1, 2026 03:02
cad0p added 29 commits June 1, 2026 03:05
… markdown

- Add markdown rendering path in renderBody with getMarkdownTheme()
- Honor resultPreviewExpanded to bypass collapse for default UX
- Add COLLAPSED_PREVIEW_LINES constant (10 lines, matching pi read/write precedent)
- Implement collapse hint for markdown mode with line count
- Add resultPreviewMode toggle (markdown ↔ plain)
- Add resultPreviewExpanded boolean toggle
- Add failurePreviewMaxChars numeric input with 1-1048576 range
- Update snapshotSettings, buildItems, and applyValue functions
…st mocks (ORCH-1, COV-1, COV-2, COV-5, COV-6, SB-4)

- Extract renderer functions to module scope for testability
- Rewrite renderer-regression.test.ts to use actual subagentNotificationRenderer
- Rewrite renderer-markdown.test.ts to use production render functions
- Rewrite renderer-expanded-setting.test.ts to test actual renderer behavior
- Rewrite menu-structure.test.ts to verify expected menu structure
- Add COV-1: code fence cut mid-block test
- Add COV-2: surrogate pair boundary truncation tests
- Add COV-5: mixed success/failure status tests
- Add COV-6: applySettings wiring tests
- Fix SB-4: correct test comment about production default behavior
- High surrogate at exact boundary: input ending at position 1999 (high surrogate of last emoji) drops unpaired surrogate, exercising safeTruncate's drop-trailing-high-surrogate path
- Isolated low surrogate: bare low surrogate in malformed input does not crash production
- No truncation needed: safeTruncate('hello', 5) returns 'hello' unchanged
- Cap zero: safeTruncate with cap=0 returns truncation message

Addresses R3-CORR-1, R3-COV-1, R3-COV-2
Both formatTaskNotification and buildNotificationDetails contained identical result preview logic with fallback chain (result ?? error ?? empty) and failure-mode truncation. Extracted to buildResultPreview() helper for DRY compliance and consistent behavior between XML and UI paths.

Addresses R3-SB-2
- High surrogate at exact boundary: input ending at position 1999 (high surrogate of last emoji) drops unpaired surrogate, exercising safeTruncate's drop-trailing-high-surrogate path
- Isolated low surrogate: bare low surrogate in malformed input does not crash production
- No truncation needed: safeTruncate('hello', 5) returns 'hello' unchanged
- Cap zero: safeTruncate with cap=0 returns truncation message

Addresses R3-CORR-1, R3-COV-1, R3-COV-2
…cation

The expect(true).toBe(true) assertion with comment 'Production default verified by grep: resultPreviewExpanded = true' is a tautology that doesn't verify anything about production code. Other tests in the same file already cover the resultPreviewExpanded behavior implicitly through override testing.

Addresses R3-SB-1, R3-CLEAN-5, R3-COV-4
Rewritten test to create 15 lines with fence opening at line 8 and closing at line 14. Truncation at COLLAPSED_PREVIEW_LINES=10 now actually cuts the fence mid-block, leaving 3 lines of code visible without the closing fence marker.

Addresses R3-SB-4
…ions

Added snapshot variants where d.others is populated to verify group rendering paths:
- 2 agents: completed + completed
- 2 agents: completed + error (mixed)
- 3 agents: completed + steered + aborted (all success-ish)

Snapshots lock the byte-equivalent output for group rendering with different status combinations.

Addresses R3-COV-3
Changed inline comment to explain that 1999 = (1000 emoji × 2 UTF-16 units) - 1, which cuts the LAST emoji's high surrogate and exercises safeTruncate's drop-trailing-high-surrogate path. This makes the boundary value meaningful rather than arbitrary.

Addresses R3-SB-3
cad0p added 18 commits June 1, 2026 03:05
…lent fallback (BLOAT-1)

The previous BLOAT-1 commit (e260e35) only added parentheses; the speculative
`?? DEFAULT_FAILURE_PREVIEW_MAX_CHARS` fallback survived. This pass applies the
reviewer's option (c) — runtime assertion that surfaces the failure-cap
contract instead of silently falling back to a duplicate default.

DEFAULT_FAILURE_PREVIEW_MAX_CHARS remains in use as the module-scope
initializer (load-order-required before applySettings runs), per BLOAT-4
clarification.
…VERTRIM-1)

Commit 5cd8dda merged stacked /** descriptive */ /** @internal */ blocks
into single /** @internal description */ form for the @internal testability
exports. Three of five functions (formatTaskNotification,
buildNotificationDetails, subagentNotificationRenderer) got the proper merge.
The other two (subagentNotificationRenderHeader, subagentNotificationRenderBody)
got their descriptions deleted instead of merged. Restore consistency by
applying the same merged-form pattern.
…only (DOC2-1)

The constant's role (module-scope initializer for the local
failurePreviewMaxChars at the bottom of index.ts) is encoded by the symbol
name and the single use site. The "mirrors settings.ts default until
applySettings runs" framing is provenance, not contract. Reflagged across
two rounds — the right fix is to drop the provenance half, not edit the
comment further.
This file declares all-local literals and asserts on those literals. It does
not import from src/ — a refactor that removed resultPreviewMode from production
would still leave it green. Settings semantics are pinned by
test/result-preview-settings.test.ts (real production import) and Phase 2 E2E
scenario 5 (live menu wiring). No regression contract is lost by deleting this.
… (B2)

The 4 UI-path it() blocks in notification-format.test.ts (calling
buildNotificationDetails) duplicate identical assertions in
notification-details.test.ts. The format file's stated scope is
formatTaskNotification (XML payload) — UI path was scope drift. R3 surrogate
boundary findings remain pinned by the originals in notification-details.test.ts.
…ed tests (B3)

test/renderer-expanded-setting.test.ts had 5 tests that all asserted
length === 2 with single-line resultPreview fixtures — they pass identically
whether the renderer is in expanded or collapsed mode. A future refactor
that inverted the effectiveExpanded ternary would still leave them green.

Replace with 2 tests in renderer-markdown.test.ts using a 50-line
resultPreview, asserting hint presence/absence in markdown.text to actually
distinguish the branches. Real regression locks instead of smoke tests.
The handles-very-large-input-without-performance-issues test asserted
duration < 100ms against 1MB ASCII operations — a magic threshold not
backed by any reviewer-pinned source, flaky on shared CI runners under
load. Large-input behavior is covered structurally by the 100KB tests
in notification-details.test.ts.
…ion.test.ts (B8)

The structural-properties-remain-consistent test asserted children.length === 2,
but the two toMatchInlineSnapshot tests above already lock the rendered shape.
Comment claimed structural test was needed for "components that can't be
rendered to text" — the snapshot tests in fact rendered to text via
extractRenderedText, so the rationale was stale.
…act (B9)

The empty body + markdown mode renders sensibly test asserted
container.children.length >= 0 — unconditionally true. The genuine no-crash
contract is now expressed as expect(...).not.toThrow(), making the
assertion actually testing what the test name claims.
…MaxChars (R2-T1)

Commit 84cc4fb added a runtime contract assertion in buildResultPreview that
throws when failurePreviewMaxChars is undefined on error/stopped status. Every
existing failure-status test fixture pairs status with a numeric setting, so
the throw path was reachable but uncovered. This test pins the contract.
…on-format.test.ts (TR3-T1)

B2 (398b77e) deleted 4 of the 5 duplicate UI-path tests in
notification-format.test.ts but missed `buildNotificationDetails handles
surrogate pairs in UI path`. Its byte-equivalent counterpart is at
test/notification-details.test.ts:94 (it("UI path: surrogate pairs handled
gracefully", ...)). The contract (UI-path surrogate-pair handling at
boundary slice) remains pinned by the surviving copy plus the analogous
XML-path test in notification-format.test.ts:18.

Closes B2 cleanly.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant