diff --git a/CONTRIBUTING-PLUGINS.md b/CONTRIBUTING-PLUGINS.md
deleted file mode 100644
index 62f88af..0000000
--- a/CONTRIBUTING-PLUGINS.md
+++ /dev/null
@@ -1,151 +0,0 @@
-# Plugin Contributor Guide
-
-This guide is for developers building plugins on top of `repterm`.
-
-## Who should read this
-
-- You are creating a reusable plugin package for terminal testing workflows.
-- You want stable public types from `repterm-api` without coupling to the full runner internals.
-- You only use Bun-based tooling (`bun`, `bunx`, `bun test`, `bun publish`).
-
-## Plugin developer workflow (Bun only)
-
-1. Install dependencies:
-
-```bash
-bun install
-```
-
-2. Build plugin and API package during development:
-
-```bash
-bun run build:plugin-api
-bun run build:plugin-kubectl
-```
-
-3. Run tests (full workspace):
-
-```bash
-bun run test
-```
-
-4. Lint core framework code before publishing changes:
-
-```bash
-bun run lint
-```
-
-## Package model
-
-- `repterm`: runtime and CLI test runner.
-- `repterm-api`: shared plugin interfaces, matcher types, and plugin helpers.
-- Plugin packages (for example `@nexusgpu/repterm-plugin-kubectl`): depend on `repterm-api` and expose domain-specific methods and hooks.
-
-## Authoring rules for high-quality plugins
-
-- Keep plugin APIs small and task-focused.
-- Prefer deterministic commands and assertions over timing-based heuristics.
-- Expose strongly typed methods and avoid leaking internal runner objects.
-- Design for CI reliability first: idempotent setup, clear cleanup, stable diagnostics.
-
-## Simple plugin example (context + terminal)
-
-`definePlugin` takes `(name, setup)`, where `setup` receives the plugin context.
-Use `ctx.testContext.terminal` to execute commands from plugin methods.
-
-```ts
-import { definePlugin, type PluginContext } from 'repterm-api';
-
-export const shellHelpers = definePlugin('shellHelpers', (ctx: PluginContext) => ({
- methods: {
- pwd: async () => {
- const result = await ctx.testContext.terminal.run('pwd');
- if (result.code !== 0) {
- throw new Error(`pwd failed: ${result.stderr}`);
- }
- return result.stdout.trim();
- },
- snapshot: async () => ctx.testContext.terminal.snapshot(),
- },
- context: {
- shellProfile: 'bash',
- },
-}));
-```
-
-## SSH plugin example (stateful + hooks)
-
-This example keeps lightweight session state in a closure, runs remote commands through the test terminal,
-and cleans up in the `afterTest` hook.
-
-```ts
-import { definePlugin, type PluginContext } from 'repterm-api';
-
-type SshTarget = {
- host: string;
- user: string;
- port?: number;
-};
-
-export const ssh = definePlugin('ssh', (ctx: PluginContext) => {
- const state: {
- target?: SshTarget;
- connected: boolean;
- } = {
- connected: false,
- };
-
- return {
- methods: {
- connect: async (target: SshTarget) => {
- state.target = target;
- state.connected = true;
- },
- runRemote: async (command: string) => {
- if (!state.connected || !state.target) {
- throw new Error('SSH not connected. Call connect() first.');
- }
-
- const port = state.target.port ?? 22;
-      // Note: escaping only double quotes is naive; `$`, backticks, and
-      // backslashes are still expanded by the remote shell.
-      const escaped = command.replace(/"/g, '\\"');
- const sshCommand = `ssh -p ${port} ${state.target.user}@${state.target.host} "${escaped}"`;
-
- const result = await ctx.testContext.terminal.run(sshCommand, { timeout: 30_000 });
- if (result.code !== 0) {
- throw new Error(`SSH command failed: ${result.stderr}`);
- }
-
- return result.stdout;
- },
- disconnect: async () => {
- state.connected = false;
- state.target = undefined;
- },
- },
- hooks: {
- afterTest: async () => {
- // Ensure state does not leak between tests.
- state.connected = false;
- state.target = undefined;
- },
- },
- context: {
- sshConnected: () => state.connected,
- },
- };
-});
-```
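The `runRemote` example above escapes only double quotes, which still lets the remote shell expand `$VAR`, backticks, and backslashes inside the forwarded command. A common safer pattern is single-quote wrapping. This is a minimal standalone sketch, not part of `repterm-api`, and the `user@host` target is a placeholder:

```typescript
// Wrap a command in single quotes for `ssh`, escaping embedded single quotes.
// Inside single quotes the shell performs no expansion, so `$`, backticks,
// and backslashes pass through literally.
export function shellQuote(command: string): string {
  return `'${command.replace(/'/g, `'\\''`)}'`;
}

// Building an ssh invocation like the example above (placeholder target):
const sshCommand = `ssh -p 22 user@host ${shellQuote('echo "$HOME"')}`;
console.log(sshCommand);
```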
-
-## Publishing checklist
-
-- Ensure package-level `README.md` exists and is up to date.
-- Confirm imports use `repterm-api` public exports.
-- Run `bun run test` and `bun run build` successfully.
-- Publish via workspace scripts (Bun):
-
-```bash
-bun run publish:plugin-api
-bun run publish:plugin-kubectl
-```
-
-If you add a new plugin package, also add docs and usage examples under that package directory.
diff --git a/README.md b/README.md
index b4157d4..222fbcd 100644
--- a/README.md
+++ b/README.md
@@ -1,81 +1,187 @@
-# Repterm
+# Repterm
+
+Terminal-first test framework for CLI/TUI apps — run tests in a real PTY, not just stdout.
+
+---
+
+```typescript
+import { test, expect } from 'repterm';
+
+test('greet the world', async ({ $ }) => {
+ const result = await $`echo "Hello, Repterm!"`;
+ expect(result).toSucceed();
+ expect(result).toHaveStdout('Hello, Repterm!');
+});
+```
+
+```bash
+$ bunx repterm tests/
+ PASS greet the world (120ms)
+```
-Repterm is a terminal-first test framework for CLI/TUI applications.
-It runs tests in a real PTY so you can assert on interactive terminal behavior, not just plain stdout.
+## Why Repterm?
-## Packages
+**Write tests the way your users use your CLI.** Simple commands run via `Bun.spawn` for precise stdout/stderr/exitCode. When you need interactive testing — prompts, TUI redraws, progress bars — flip on `{ interactive: true }` and Repterm spawns a real PTY with full send/expect control.
+
+**Familiar, structured API.** If you've used Playwright or Vitest, you already know how to use Repterm. `test()`, `describe()`, `expect()` — plus a `$` tagged template for running commands with automatic shell escaping.
-- `repterm`: runner + CLI
-- `repterm-api`: plugin/matcher API for extension authors
-- `@nexusgpu/repterm-plugin-kubectl`: kubectl-focused plugin
+**Tests become documentation.** Run with `--record` and every test produces an [asciinema](https://asciinema.org/) recording. Your test suite generates always-up-to-date terminal demos — no manual recording sessions.
-Plugin contributors: see [Plugin Contributor Guide](CONTRIBUTING-PLUGINS.md).
+**Parallel and fast.** Scale with `--workers N`. Each test gets its own isolated session.
-## AI Agent Skill (skills.sh)
+**Extensible by design.** Build plugins to add domain-specific commands and matchers. The official [kubectl plugin](https://repterm.ai/docs/kubectl/overview) adds Kubernetes-aware testing with assertions like `toBeRunning()` and `toHaveReplicas()`.
-This repository includes the `repterm` agent skill at `skills/repterm/`.
+## Features
-Install from GitHub:
+- **Dual execution modes** — precise `Bun.spawn` by default; opt into a real PTY with `{ interactive: true }` for colors, cursor control, and TUI testing
+- **Playwright-style API** — familiar `expect()` assertions, `proc.send()` / `proc.expect()` for interactive flows
+- **`$` tagged templates** — run commands with automatic shell escaping: `` await $`echo ${userInput}` ``
+- **Parallel test runner** — execute tests concurrently with `--workers N`
+- **Terminal recording** — generate [asciinema](https://asciinema.org/) recordings with `--record`
+- **Plugin system** — extend test contexts with custom helpers, matchers, and lifecycle hooks
+- **Multi-terminal** — spin up multiple PTY sessions in a single test for client-server scenarios
+
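The escaping behavior of a `$` tagged template can be illustrated with a standalone sketch. This is hypothetical and shown only to explain the idea; it is not repterm's actual implementation:

```typescript
// Illustrative `$`-style tagged template that single-quotes interpolated
// values before they reach a shell, so metacharacters in user input are
// treated as literal text rather than executed.
function shellTemplate(strings: TemplateStringsArray, ...values: unknown[]): string {
  return strings.reduce(
    (cmd, part, i) =>
      cmd +
      part +
      (i < values.length ? `'${String(values[i]).replace(/'/g, `'\\''`)}'` : ''),
    '',
  );
}

const userInput = 'hello; rm -rf /tmp/x';
// The `;` stays inside single quotes, so a shell would not interpret it.
console.log(shellTemplate`echo ${userInput}`);
```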
+## Quick Start
+
+### Install
```bash
-npx skills add NexusGPU/repterm --skill repterm
+bun add -d repterm
```
-Local discovery check:
+Or install the standalone binary:
```bash
-npx skills add . --list
+curl -fsSL https://cdn.tensor-fusion.ai/archive/repterm/install.sh | sh
```
-## Install
+
+**Windows (PowerShell):**
-```bash
-bun add -d repterm
+```powershell
+iwr https://cdn.tensor-fusion.ai/archive/repterm/install.ps1 -UseBasicParsing | iex
```
-Run tests:
+
+
+### Write a Test
+
+Create `tests/demo.ts`:
+
+```typescript
+import { test, expect, describe } from 'repterm';
+
+describe('my CLI', () => {
+ test('exits with code 0', async ({ $ }) => {
+ const result = await $`echo "it works"`;
+ expect(result).toSucceed();
+ expect(result).toHaveStdout('it works');
+ });
+
+ test('handles stderr', async ({ $ }) => {
+ const result = await $`echo "oops" >&2`;
+ expect(result).toHaveStderr('oops');
+ });
+});
+```
+
+### Run
```bash
bunx repterm tests/
-bunx repterm --workers 4 tests/
-bunx repterm --record tests/
```
-## Install Binary (from R2/CDN)
+## Interactive Testing
+
+Test interactive programs like prompts, editors, or TUI apps:
+
+```typescript
+test('interactive prompt', async ({ $ }) => {
+ const proc = $({ interactive: true })`bash -c 'read -p "Name: " n; echo "Hi $n"'`;
-The release workflow uploads standalone binaries to Cloudflare R2 using this layout:
+ await proc.expect('Name:');
+ await proc.send('Alice');
+ await proc.expect('Hi Alice');
+});
+```
-- `.../archive/repterm/latest/repterm--`
-- `.../archive/repterm/v/repterm--`
+## Parallel Execution
-### Linux/macOS
+Run tests across multiple workers:
```bash
-curl -fsSL https://cdn.tensor-fusion.ai/archive/repterm/install.sh | sh
+bunx repterm --workers 4 tests/
```
-Install a specific version and custom source:
+## Recording
+
+Generate terminal recordings for docs or CI artifacts:
```bash
-curl -fsSL https://cdn.tensor-fusion.ai/archive/repterm/install.sh \
- | REPTERM_VERSION=v0.2.0 REPTERM_BASE_URL=https://cdn.tensor-fusion.ai/archive/repterm sh
+bunx repterm --record tests/
```
-Optional environment variables for `scripts/install.sh`:
+Produces [asciinema](https://asciinema.org/)-compatible `.cast` files you can embed or replay.
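A `.cast` file uses the asciicast v2 format: one JSON header line followed by one JSON event per line, shaped as `[elapsedSeconds, "o" | "i", data]`. A minimal reader sketch, independent of repterm internals:

```typescript
// Parse an asciicast v2 document: a JSON header line followed by JSON event
// lines of the form [elapsedSeconds, "o" (output) | "i" (input), data].
type CastEvent = [elapsedSeconds: number, kind: 'o' | 'i', data: string];

function parseCast(cast: string): { header: { version: number }; events: CastEvent[] } {
  const [headerLine, ...eventLines] = cast.trim().split('\n');
  return {
    header: JSON.parse(headerLine),
    events: eventLines.map((line) => JSON.parse(line) as CastEvent),
  };
}

const sample = [
  '{"version": 2, "width": 80, "height": 24}',
  '[0.12, "o", "PASS greet the world"]',
].join('\n');

console.log(parseCast(sample).events[0][2]);
```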
-- `REPTERM_VERSION`: default `latest`
-- `REPTERM_BASE_URL`: default `https://cdn.tensor-fusion.ai/archive/repterm`
-- `REPTERM_INSTALL_DIR`: default `/usr/local/bin`
+## Plugins
-### Windows (PowerShell)
+Repterm has a plugin system for domain-specific testing. The first official plugin adds Kubernetes support:
-```powershell
-$env:REPTERM_VERSION = "latest"
-$env:REPTERM_BASE_URL = "https://cdn.tensor-fusion.ai/archive/repterm"
-iwr https://cdn.tensor-fusion.ai/archive/repterm/install.ps1 -UseBasicParsing | iex
+```typescript
+import { defineConfig, createTestWithPlugins, expect } from 'repterm';
+import { kubectlPlugin, pod } from '@nexusgpu/repterm-plugin-kubectl';
+
+const config = defineConfig({
+ plugins: [kubectlPlugin({ namespace: 'default' })] as const,
+});
+
+const test = createTestWithPlugins(config);
+
+test('pod is running', async (ctx) => {
+ const { kubectl } = ctx.plugins;
+ await kubectl.apply('manifests/nginx.yaml');
+ await kubectl.waitForPod('nginx', 'Running');
+ await expect(pod('nginx')).toBeRunning();
+});
```
-## Examples
+Build your own plugins with [`repterm-api`](https://www.npmjs.com/package/repterm-api) — see the [Plugin Guide](https://repterm.ai/docs/plugins/overview).
+
+## Packages
+
+| Package | Description |
+|---------|-------------|
+| [`repterm`](https://www.npmjs.com/package/repterm) | Core framework + CLI runner |
+| [`repterm-api`](https://www.npmjs.com/package/repterm-api) | Plugin/matcher API for extension authors |
+| [`@nexusgpu/repterm-plugin-kubectl`](https://www.npmjs.com/package/@nexusgpu/repterm-plugin-kubectl) | Kubernetes testing plugin |
+
+## Documentation
+
+Full documentation is available at **[repterm.ai](https://repterm.ai)**.
+
+- [Getting Started](https://repterm.ai/docs/getting-started)
+- [Writing Tests](https://repterm.ai/docs/guides/writing-tests)
+- [Interactive Commands](https://repterm.ai/docs/guides/interactive-commands)
+- [Recording](https://repterm.ai/docs/guides/recording)
+- [Plugin API](https://repterm.ai/docs/plugins/overview)
+- [Kubectl Plugin](https://repterm.ai/docs/kubectl/overview)
+- [API Reference](https://repterm.ai/docs/api/assertions)
+
+## Contributing
+
+See the [Plugin Development Guide](https://repterm.ai/docs/plugins/creating-plugins) to build your own plugins.
+
+## License
-- Repterm examples: `packages/repterm/examples/README.md`
-- Kubectl plugin examples: `packages/plugin-kubectl/examples/README.md`
+[Apache-2.0](LICENSE)
diff --git a/specs/001-tui-test-framework/checklists/requirements.md b/specs/001-tui-test-framework/checklists/requirements.md
deleted file mode 100644
index b3ce35e..0000000
--- a/specs/001-tui-test-framework/checklists/requirements.md
+++ /dev/null
@@ -1,34 +0,0 @@
-# Specification Quality Checklist: CLI/TUI Test Framework
-
-**Purpose**: Validate specification completeness and quality before proceeding to planning
-**Created**: 2026-01-26
-**Feature**: `/Users/bailian/tensor-fusion/repterm/specs/001-tui-test-framework/spec.md`
-
-## Content Quality
-
-- [x] No implementation details (languages, frameworks, APIs)
-- [x] Focused on user value and business needs
-- [x] Written for non-technical stakeholders
-- [x] All mandatory sections completed
-
-## Requirement Completeness
-
-- [x] No [NEEDS CLARIFICATION] markers remain
-- [x] Requirements are testable and unambiguous
-- [x] Success criteria are measurable
-- [x] Success criteria are technology-agnostic (no implementation details)
-- [x] All acceptance scenarios are defined
-- [x] Edge cases are identified
-- [x] Scope is clearly bounded
-- [x] Dependencies and assumptions identified
-
-## Feature Readiness
-
-- [x] All functional requirements have clear acceptance criteria
-- [x] User scenarios cover primary flows
-- [x] Feature meets measurable outcomes defined in Success Criteria
-- [x] No implementation details leak into specification
-
-## Notes
-
-- Items marked incomplete require spec updates before `/speckit.clarify` or `/speckit.plan`
diff --git a/specs/001-tui-test-framework/contracts/openapi.yaml b/specs/001-tui-test-framework/contracts/openapi.yaml
deleted file mode 100644
index 291f9dc..0000000
--- a/specs/001-tui-test-framework/contracts/openapi.yaml
+++ /dev/null
@@ -1,131 +0,0 @@
-openapi: 3.0.3
-info:
- title: Repterm Test Runner API
- version: 0.1.0
-servers:
- - url: http://localhost:0
-paths:
- /runs:
- post:
- summary: Start a test run
- requestBody:
- required: true
- content:
- application/json:
- schema:
- $ref: '#/components/schemas/RunRequest'
- responses:
- '202':
- description: Run accepted
- content:
- application/json:
- schema:
- $ref: '#/components/schemas/RunAccepted'
- /runs/{runId}:
- get:
- summary: Get run status and summary
- parameters:
- - name: runId
- in: path
- required: true
- schema:
- type: string
- responses:
- '200':
- description: Run status
- content:
- application/json:
- schema:
- $ref: '#/components/schemas/RunStatus'
- /runs/{runId}/artifacts:
- get:
- summary: List artifacts for a run
- parameters:
- - name: runId
- in: path
- required: true
- schema:
- type: string
- responses:
- '200':
- description: Artifacts list
- content:
- application/json:
- schema:
- $ref: '#/components/schemas/ArtifactList'
-components:
- schemas:
- RunRequest:
- type: object
- required: [testPaths]
- properties:
- testPaths:
- type: array
- items:
- type: string
- record:
- type: object
- properties:
- enabled:
- type: boolean
- castFile:
- type: string
- parallel:
- type: object
- properties:
- workers:
- type: integer
- minimum: 1
- timeouts:
- type: object
- properties:
- suiteMs:
- type: integer
- minimum: 1
- testMs:
- type: integer
- minimum: 1
- RunAccepted:
- type: object
- required: [runId]
- properties:
- runId:
- type: string
- RunStatus:
- type: object
- required: [runId, status]
- properties:
- runId:
- type: string
- status:
- type: string
- enum: [queued, running, passed, failed]
- totals:
- type: object
- properties:
- passed:
- type: integer
- failed:
- type: integer
- skipped:
- type: integer
- durationMs:
- type: integer
- minimum: 0
- ArtifactList:
- type: object
- required: [items]
- properties:
- items:
- type: array
- items:
- $ref: '#/components/schemas/Artifact'
- Artifact:
- type: object
- required: [type, path]
- properties:
- type:
- type: string
- enum: [cast, log, snapshot]
- path:
- type: string
diff --git a/specs/001-tui-test-framework/data-model.md b/specs/001-tui-test-framework/data-model.md
deleted file mode 100644
index bb2583b..0000000
--- a/specs/001-tui-test-framework/data-model.md
+++ /dev/null
@@ -1,26 +0,0 @@
-## Data Model
-
-### Test Suite
-- **Fields**: `id`, `name`, `tests[]`, `config` (timeouts, parallel, record)
-- **Relationships**: has many `Test Case`
-- **Validation**: name required; timeouts must be positive integers
-
-### Test Case
-- **Fields**: `id`, `name`, `steps[]`, `timeout`, `fixtures`
-- **Relationships**: belongs to `Test Suite`; has many `Step`
-- **Validation**: name required; steps must be non-empty
-
-### Step
-- **Fields**: `id`, `type` (input/wait/assert), `payload`, `timeout`
-- **Relationships**: belongs to `Test Case`
-- **Validation**: type required; timeout optional but must be positive if set
-
-### Run Result
-- **Fields**: `id`, `suiteId`, `caseId`, `status` (pass/fail), `durationMs`, `error`
-- **Relationships**: has many `Artifact`
-- **Validation**: duration non-negative; status required
-
-### Artifact
-- **Fields**: `id`, `runResultId`, `type` (cast/log/snapshot), `path`
-- **Relationships**: belongs to `Run Result`
-- **Validation**: path required; type required
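The entities and validation rules above can be sketched as TypeScript types; these are illustrative shapes using the field names from this document, not the framework's actual internal model:

```typescript
// Illustrative shapes for the data model above (field names from this doc).
type Step = { id: string; type: 'input' | 'wait' | 'assert'; payload?: unknown; timeout?: number };
type TestCase = { id: string; name: string; steps: Step[]; timeout?: number };
type RunResult = { id: string; suiteId: string; caseId: string; status: 'pass' | 'fail'; durationMs: number; error?: string };

// Validation mirroring the stated rules: name required, non-empty steps,
// positive timeouts when set.
function validateCase(tc: TestCase): string[] {
  const errors: string[] = [];
  if (!tc.name) errors.push('name required');
  if (tc.steps.length === 0) errors.push('steps must be non-empty');
  for (const s of tc.steps) {
    if (s.timeout !== undefined && s.timeout <= 0) errors.push(`step ${s.id}: timeout must be positive`);
  }
  return errors;
}

console.log(validateCase({ id: 't1', name: 'demo', steps: [] }));
```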
diff --git a/specs/001-tui-test-framework/plan.md b/specs/001-tui-test-framework/plan.md
deleted file mode 100644
index 8b3b888..0000000
--- a/specs/001-tui-test-framework/plan.md
+++ /dev/null
@@ -1,72 +0,0 @@
-# Implementation Plan: CLI/TUI Test Framework
-
-**Branch**: `001-tui-test-framework` | **Date**: 2026-01-26 | **Spec**: `/specs/001-tui-test-framework/spec.md`
-**Input**: Feature specification from `/specs/001-tui-test-framework/spec.md`
-
-**Note**: This template is filled in by the `/speckit.plan` command. See `.specify/templates/commands/plan.md` for the execution workflow.
-
-## Summary
-
-Deliver a minimal TypeScript TUI testing framework with Playwright-style API authoring, CLI execution, parallel runs, and clear pass/fail output. Provide two execution modes: recording (asciinema + tmux, with human-like typing) and non-recording (direct user terminal), following the flow in `simple-example.js` and referencing implementation patterns from `playwright/`.
-
-## Technical Context
-
-**Language/Version**: TypeScript (ESM) on Node.js 20.11.0
-**Primary Dependencies**: node-pty, asciinema CLI (external), tmux (external); Playwright repo as style reference (no runtime dependency)
-**Storage**: Local artifacts only (asciinema `.cast` files, text snapshots)
-**Testing**: Node.js `node:test` + `assert` for framework tests
-**Target Platform**: macOS/Linux terminals with PTY support
-**Project Type**: Single project
-**Performance Goals**: p95 input-to-output handling under 50ms in non-recording mode; recording overhead <20% runtime vs non-recording for typical tests
-**Constraints**: Must run on Node 20.11.0 to support `node-pty`; requires `asciinema` and `tmux` installed for recording mode; no Windows support for MVP
-**Scale/Scope**: 10–100 tests per run, parallelism up to 4 workers
-
-## Constitution Check
-
-*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*
-
-- MVP scope defined; P1 delivers independent value
-- Solution is the simplest viable approach; any extra complexity justified
-- Test strategy defined for each user story (unit/integration/e2e as needed)
-- UX consistency noted (patterns, copy, navigation)
-- Performance goals and constraints documented
-
-## Project Structure
-
-### Documentation (this feature)
-
-```text
-specs/001-tui-test-framework/
-├── plan.md # This file (/speckit.plan command output)
-├── research.md # Phase 0 output (/speckit.plan command)
-├── data-model.md # Phase 1 output (/speckit.plan command)
-├── quickstart.md # Phase 1 output (/speckit.plan command)
-├── contracts/ # Phase 1 output (/speckit.plan command)
-└── tasks.md # Phase 2 output (/speckit.tasks command - NOT created by /speckit.plan)
-```
-
-### Source Code (repository root)
-
-```text
-src/
-├── api/ # Playwright-style public API: test/expect/fixtures
-├── cli/ # CLI entrypoint and reporting
-├── runner/ # scheduler, parallel workers, test lifecycle
-├── terminal/ # node-pty wiring, tmux integration
-├── recording/ # asciinema session control and artifacts
-└── utils/ # shared helpers (timing, typing cadence)
-
-tests/
-├── unit/ # small pure logic tests
-├── integration/ # PTY + process wiring (mocked where possible)
-└── e2e/ # full CLI run against simple-example flow
-```
-
-**Structure Decision**: Single-project layout to minimize overhead while keeping API, CLI, and terminal/recording concerns separated.
-
-## Complexity Tracking
-
-> **Fill ONLY if Constitution Check has violations that must be justified**
-
-| Violation | Why Needed | Simpler Alternative Rejected Because |
-|-----------|------------|-------------------------------------|
diff --git a/specs/001-tui-test-framework/quickstart.md b/specs/001-tui-test-framework/quickstart.md
deleted file mode 100644
index ecba51f..0000000
--- a/specs/001-tui-test-framework/quickstart.md
+++ /dev/null
@@ -1,182 +0,0 @@
-## Quickstart
-
-Plugin contributors: see [Plugin Contributor Guide](../../CONTRIBUTING-PLUGINS.md).
-
-### Prerequisites
-- Bun 1.3+
-- `asciinema` and `tmux` installed for recording mode
-
-### Install (local development)
-```bash
-bun install
-```
-
-### Example: Run a single terminal test
-```ts
-import { test, expect } from 'repterm';
-
-test('cli test!!', async ({ terminal }) => {
- await terminal.start('echo "Hello, world!"');
- // Wait for the output to appear
- await terminal.waitForText('Hello, world!', { timeout: 5000 });
- await expect(terminal).toContainText('Hello, world!');
-});
-```
-
-### Example: Run a multi-terminal test
-```ts
-/**
- * Multi-window HTTP server test example
- *
- * This test demonstrates:
- * 1. Starting an HTTP server in one terminal window
- * 2. Making requests to the server from another terminal window
- * 3. Validating the response data
- */
-import { test, expect, terminalFactory } from 'repterm';
-
-test('multi-window: POST request with body', async ({ terminal: serverTerminal }) => {
- // Terminal 1: Start server that handles POST requests
- await serverTerminal.start(`cat > /tmp/post-server.js << 'SCRIPT'
-const http = require('http');
-const server = http.createServer((req, res) => {
- let body = '';
- req.on('data', chunk => body += chunk);
- req.on('end', () => {
- res.writeHead(200, { 'Content-Type': 'application/json' });
- res.end(JSON.stringify({
- received: body,
- method: req.method,
- contentType: req.headers['content-type']
- }));
- });
-});
-server.listen(8768, () => console.log('POST server ready on port 8768'));
-SCRIPT`);
-
- await serverTerminal.start('node /tmp/post-server.js');
- await serverTerminal.waitForText('POST server ready on port 8768', { timeout: 10000 });
-
- // Terminal 2: Client terminal (created as a pane in the same tmux session)
- const clientTerminal = await terminalFactory.create();
-
- // Send POST request with JSON body
- await clientTerminal.start(
- `curl -s -X POST -H "Content-Type: application/json" -d '{"name":"repterm","version":"1.0"}' http://localhost:8768/`
- );
-
- // Validate server received the POST body
- // Wait for 'received' which only appears in the response JSON, not in the command
- await clientTerminal.waitForText('received', { timeout: 5000 });
-
- // Note: Due to line wrapping in narrow panes, we check for substrings
- // that are likely to be on a single line
- await expect(clientTerminal).toContainText('"POST"');
- await expect(clientTerminal).toContainText('received');
- await expect(clientTerminal).toContainText('repterm');
-
- // Note: Don't manually close the client terminal - let the framework handle cleanup
- // This ensures the multi-pane view is captured in the recording
-});
-```
-
-### Example: Organized test suites with test.describe
-
-```ts
-import { test, describe, expect } from 'repterm';
-
-describe('Authentication', () => {
- test('should login successfully', async ({ terminal }) => {
- await terminal.start('echo "Login successful"');
- await expect(terminal).toContainText('Login successful');
- });
-
- test('should handle invalid credentials', async ({ terminal }) => {
- await terminal.start('echo "Invalid credentials"');
- await expect(terminal).toContainText('Invalid credentials');
- });
-});
-
-describe('User Profile', () => {
- test('should display user info', async ({ terminal }) => {
- await terminal.start('echo "User: admin"');
- await expect(terminal).toContainText('User: admin');
- });
-});
-```
-
-### Example: Using test.step for better test organization
-
-```ts
-import { test, expect } from 'repterm';
-
-test('database migration', async ({ terminal }) => {
- await test.step('Connect to database', async () => {
- await terminal.start('psql -U postgres');
- await terminal.waitForText('postgres=#', { timeout: 5000 });
- });
-
- await test.step('Run migration', async () => {
- await terminal.start('\\i migrations/001_create_users.sql');
- await expect(terminal).toContainText('CREATE TABLE');
- });
-
- await test.step('Verify schema', async () => {
- await terminal.start('\\dt');
- await expect(terminal).toContainText('users');
- });
-});
-```
-
-### Example: Using hooks for setup and teardown
-
-```ts
-import { test, beforeEach, afterEach, expect } from 'repterm';
-
-beforeEach(async ({ terminal }) => {
- // Setup before each test
- await terminal.start('mkdir -p /tmp/test-data');
-});
-
-afterEach(async ({ terminal }) => {
- // Cleanup after each test
- await terminal.start('rm -rf /tmp/test-data');
-});
-
-test('should use temp directory', async ({ terminal }) => {
- await terminal.start('ls /tmp/test-data');
- await expect(terminal).toContainText('test-data');
-});
-```
-
-## CLI Usage
-
-### Run tests (non-recorded)
-```bash
-bunx repterm tests/example.test.ts
-```
-
-### Run tests with recording enabled
-```bash
-bunx repterm --record tests/example.test.ts
-```
-
-### Run tests in parallel with 4 workers
-```bash
-bunx repterm --workers 4 tests/
-```
-
-### Run tests with custom timeout
-```bash
-bunx repterm --timeout 60000 tests/
-```
-
-### Run tests with verbose output
-```bash
-bunx repterm --verbose tests/
-```
-
-### Show help
-```bash
-bunx repterm --help
-```
diff --git a/specs/001-tui-test-framework/research.md b/specs/001-tui-test-framework/research.md
deleted file mode 100644
index 72863b1..0000000
--- a/specs/001-tui-test-framework/research.md
+++ /dev/null
@@ -1,25 +0,0 @@
-## Research Summary
-
-### Decision: Use Node.js 20.11.0 + TypeScript (ESM)
-**Rationale**: Requirement mandates Node 20.11.0 for `node-pty` compatibility; TypeScript provides typed public API for Playwright-style authoring.
-**Alternatives considered**: Using older Node versions (rejected by requirement); CommonJS (adds friction with TS ESM types and tooling).
-
-### Decision: Recording mode uses `asciinema rec` driven by `node-pty`
-**Rationale**: Matches `simple-example.js` behavior, captures terminal output deterministically, and keeps the runtime local without extra services.
-**Alternatives considered**: Direct terminal capture without asciinema (misses required recording artifacts); OS-level screen recording (too heavy and inconsistent).
-
-### Decision: Multi-window tests use `tmux` as the entry point
-**Rationale**: When users open multiple terminal sessions in a single test, `tmux` provides a single recording surface and supports `split-window` to show concurrent panes.
-**Alternatives considered**: Multiple independent PTYs (not visible in one recording); nested asciinema sessions (fragile and hard to sync).
-
-### Decision: Human-like typing simulated via per-character writes with jitter
-**Rationale**: Requirement explicitly asks for human-like typing in recordings; per-char writes with randomized delays create natural motion in the `.cast`.
-**Alternatives considered**: Instant input (fails the recording requirement); fixed delay without jitter (looks robotic).
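The per-character-with-jitter approach can be sketched as follows. The `write` callback is a stand-in for a PTY write function and is an assumption for illustration, not the actual implementation:

```typescript
// Human-like typing sketch: emit one character at a time with a base delay
// plus random jitter. `write` stands in for a PTY write function.
async function typeHumanLike(
  write: (chunk: string) => void,
  text: string,
  baseMs = 5,
  jitterMs = 5,
): Promise<void> {
  for (const ch of text) {
    write(ch);
    await new Promise((resolve) => setTimeout(resolve, baseMs + Math.random() * jitterMs));
  }
}

const chunks: string[] = [];
typeHumanLike((c) => chunks.push(c), 'ls -la').then(() => {
  console.log(chunks.join(''));
});
```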
-
-### Decision: Playwright-style API surface for authoring
-**Rationale**: Requirement mandates a Playwright-like API and the repo already includes Playwright source for reference; align naming (`test`, `expect`, `describe`, `test.step`).
-**Alternatives considered**: Custom DSL (higher learning curve, conflicts with requirements).
-
-### Decision: Parallel execution via worker processes with isolated PTY state
-**Rationale**: Meets FR-004 by isolating terminal state and artifacts per worker; keeps failures contained while allowing concurrency.
-**Alternatives considered**: In-process parallelism (risks shared PTY state and flaky output).
diff --git a/specs/001-tui-test-framework/spec.md b/specs/001-tui-test-framework/spec.md
deleted file mode 100644
index 4d62dcb..0000000
--- a/specs/001-tui-test-framework/spec.md
+++ /dev/null
@@ -1,145 +0,0 @@
-# Feature Specification: CLI/TUI Test Framework
-
-**Feature Branch**: `001-tui-test-framework`
-**Created**: 2026-01-26
-**Status**: Draft
-**Input**: User description: "Build a CLI/TUI test framework that lets users write test cases for terminal (TUI) applications using a Playwright-style API, providing a highly maintainable, highly usable test framework that supports parallel execution."
-
-## User Scenarios & Testing *(mandatory)*
-
-### User Story 1 - Write and Run TUI Tests (Priority: P1)
-
-As a QA engineer, I want to write terminal UI tests using a Playwright-style
-test framework with clear steps and expectations, and run them from the
-command line, so I can validate a TUI application
-with readable, repeatable tests.
-
-**Why this priority**: Enables the MVP: test authoring and execution.
-
-**Independent Test**: A user can write one test for a sample TUI flow and run
-it from the CLI to get a pass/fail result.
-
-**Acceptance Scenarios**:
-
-1. **Given** a TUI test case, **When** I define a test with steps and
- assertions and run it, **Then** I receive a clear pass/fail result.
-2. **Given** a failing expectation, **When** I run the test, **Then** the
- output explains the failure and the last observed terminal state.
-
----
-
-### User Story 2 - Parallel Test Execution (Priority: P2)
-
-As a test maintainer, I want to run multiple tests in parallel with reliable
-isolation, so I can reduce total execution time without flaky interference.
-
-**Why this priority**: Improves usability and scalability for real projects.
-
-**Independent Test**: A user can run a suite with multiple tests and observe
-parallel execution with consistent results.
-
-**Acceptance Scenarios**:
-
-1. **Given** a suite with multiple tests, **When** I enable parallel runs,
- **Then** the total runtime decreases versus serial execution.
-2. **Given** parallel runs, **When** one test fails, **Then** other tests
- continue and their results are reported independently.
-
----
-
-### User Story 3 - Maintainable Test Organization (Priority: P3)
-
-As a team lead, I want tests to be organized and reusable, so the suite stays
-maintainable as it grows.
-
-**Why this priority**: Supports long-term adoption and team scalability.
-
-**Independent Test**: A user can group tests by suite, reuse common steps, and
-still run a single suite independently.
-
-**Acceptance Scenarios**:
-
-1. **Given** shared steps, **When** I apply them across multiple tests,
- **Then** the tests remain readable and consistent.
-
----
-
-## MVP Scope *(mandatory)*
-
-### In Scope (MVP)
-
-- A Playwright-style API authoring interface for TUI test creation (steps,
-  waits, assertions, expectations)
-- CLI execution of tests with clear pass/fail output
-- Basic failure diagnostics with terminal output capture
-- The test process can be recorded and replayed.
-
-### Out of Scope (Deferred)
-
-- Distributed execution across multiple machines
-
-### Non-Goals
-
-- Full automation for non-terminal graphical apps
-- Replacing existing unit-test frameworks
-
-### Edge Cases
-
-- Test hangs due to missing output or waiting conditions
-- TUI app crashes or exits unexpectedly mid-test
-- Parallel tests contend for terminal resources
-- Non-deterministic output timing causing flaky assertions
-
-## Assumptions
-
-- Users run tests from a local CLI in a standard terminal environment.
-- The framework will target common terminal behaviors rather than
-  emulator-specific quirks.
-- Default timeouts are acceptable for most tests but can be configured.
-
-## Dependencies
-
-- Access to a runnable TUI application under test
-- Ability to launch and terminate the app from the CLI
-- A terminal environment that supports standard input/output
-
-## Requirements *(mandatory)*
-
-### Functional Requirements
-
-- **FR-001**: The system MUST allow users to define test suites and test cases
-  using a Playwright-style API for TUI interactions.
-- **FR-002**: The system MUST support core terminal actions (send input, wait
- for output, assert screen content).
-- **FR-003**: The system MUST run tests from a CLI and return a non-zero exit
- code on failure.
-- **FR-004**: The system MUST support parallel execution with isolation so
- tests do not interfere with each other.
-- **FR-005**: The system MUST produce human-readable results with per-test
- pass/fail status and summary counts.
-- **FR-006**: The system MUST capture failure diagnostics that include the
- last observed terminal output and the failed expectation.
-- **FR-007**: The system MUST allow users to configure timeouts at the suite
- or test level.
-- **FR-008**: Users MUST be able to organize tests into suites and reuse
- shared steps across tests.
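FR-003 and FR-005 together can be illustrated with a minimal sketch: a registry of test cases, a runner producing per-test pass/fail plus summary counts, and an exit code derived from failures. The names `register` and `runRegistered` are hypothetical, not the actual repterm API.

```typescript
type Result = { name: string; passed: boolean };

const registry: Array<{ name: string; fn: () => void }> = [];

function register(name: string, fn: () => void) {
  registry.push({ name, fn });
}

// Run all registered tests; a thrown error marks that test as failed.
function runRegistered() {
  const results: Result[] = registry.map(({ name, fn }) => {
    try { fn(); return { name, passed: true }; }
    catch { return { name, passed: false }; }
  });
  const failed = results.filter((r) => !r.passed).length;
  // FR-003: non-zero exit code on any failure; FR-005: summary counts.
  return { results, passed: results.length - failed, failed, exitCode: failed > 0 ? 1 : 0 };
}

register("prompt renders", () => {});
register("bad assertion", () => { throw new Error("expected prompt"); });
const summary = runRegistered();
// summary.passed === 1, summary.failed === 1, summary.exitCode === 1
```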
-
-### Key Entities *(include if feature involves data)*
-
-- **Test Suite**: A named collection of tests with shared configuration.
-- **Test Case**: An ordered set of steps and assertions for a TUI flow.
-- **Step**: A single interaction or expectation within a test case.
-- **Run Result**: The status and timing for a test run (pass/fail, duration).
-- **Artifact**: Captured terminal output associated with a run or failure.
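One possible TypeScript shape for the entities above — field names beyond those listed in the spec (for example `durationMs` and `path`) are assumptions, not decided design:

```typescript
interface Step { description: string; kind: "action" | "expectation"; }
interface TestCase { name: string; steps: Step[]; }
interface TestSuite { name: string; timeoutMs?: number; tests: TestCase[]; }
interface RunResult { test: string; status: "pass" | "fail"; durationMs: number; }
interface Artifact { runId: string; path: string; }

// Example suite built from these shapes.
const suite: TestSuite = {
  name: "login-flow",
  timeoutMs: 5_000,
  tests: [{
    name: "shows prompt",
    steps: [
      { description: "send 'login'", kind: "action" },
      { description: "screen contains 'Password:'", kind: "expectation" },
    ],
  }],
};
```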
-
-## Success Criteria *(mandatory)*
-
-### Measurable Outcomes
-
-- **SC-001**: New users can author and run a basic TUI test within 30 minutes.
-- **SC-002**: Parallel execution reduces total runtime by at least 40% for a
- suite of 10 or more tests compared to serial execution.
-- **SC-003**: 95% of test failures include actionable diagnostics that identify
- the failing step and terminal state.
-- **SC-004**: 90% of users report that test suites remain readable and
- maintainable after adding 20+ tests.
diff --git a/specs/001-tui-test-framework/tasks.md b/specs/001-tui-test-framework/tasks.md
deleted file mode 100644
index 73788b8..0000000
--- a/specs/001-tui-test-framework/tasks.md
+++ /dev/null
@@ -1,240 +0,0 @@
----
-description: "Task list for CLI/TUI test framework"
----
-
-# Tasks: CLI/TUI Test Framework
-
-**Input**: Design documents from `/specs/001-tui-test-framework/`
-**Prerequisites**: plan.md, spec.md, research.md, data-model.md, contracts/
-
-**Tests**: Tests are REQUIRED for every user story (TDD). Write tests first and ensure they fail before implementation.
-
-**Organization**: Tasks are grouped by user story to enable independent implementation and testing of each story.
-
-## Format: `[ID] [P?] [Story] Description`
-
-- **[P]**: Can run in parallel (different files, no dependencies)
-- **[Story]**: Which user story this task belongs to (e.g., US1, US2, US3)
-- Include exact file paths in descriptions
-
-## Phase 1: Setup (Shared Infrastructure)
-
-**Purpose**: Project initialization and basic structure
-
-- [X] T001 Create `package.json` with Node 20.11.0 engines, scripts, and deps for TypeScript + node-pty
-- [X] T002 Add TypeScript ESM config in `tsconfig.json` with `src/` → `dist/` output
-- [X] T003 [P] Create public entrypoint scaffold in `src/index.ts`
-- [X] T004 [P] Add npm tooling configs in `eslint.config.js` and `.prettierrc`
-
----
-
-## Phase 2: Foundational (Blocking Prerequisites)
-
-**Purpose**: Core infrastructure that MUST be complete before ANY user story can be implemented
-
-**⚠️ CRITICAL**: No user story work can begin until this phase is complete
-
-- [X] T005 Define core entities (TestSuite, TestCase, Step, RunResult, Artifact) in `src/runner/models.ts`
-- [X] T006 [P] Implement run configuration loader (timeouts, record, parallel) in `src/runner/config.ts`
-- [X] T007 [P] Implement artifact directory manager and path helpers in `src/runner/artifacts.ts`
-- [X] T008 Implement test file discovery and loading in `src/runner/loader.ts`
-- [X] T009 Implement terminal session abstraction around node-pty in `src/terminal/session.ts`
-
-**Checkpoint**: Foundation ready - user story implementation can now begin in parallel
-
----
-
-## Phase 3: User Story 1 - Write and Run TUI Tests (Priority: P1) 🎯 MVP
-
-**Goal**: Playwright-style authoring, CLI execution, clear pass/fail output, and recording support.
-
-**Independent Test**: A user can author a single test and run `repterm test` to see a pass/fail result with terminal output on failure.
-
-### Tests for User Story 1 (REQUIRED) ⚠️
-
-> **NOTE: Write these tests FIRST, ensure they FAIL before implementation**
-
-- [X] ~~T010 [P] [US1] Contract test for POST `/runs` (REMOVED - API server not needed)~~
-- [X] ~~T011 [P] [US1] Contract test for GET `/runs/{runId}` (REMOVED - API server not needed)~~
-- [X] ~~T012 [P] [US1] Contract test for GET `/runs/{runId}/artifacts` (REMOVED - API server not needed)~~
-- [X] T013 [P] [US1] Integration test for CLI single test run in `tests/integration/cli-run.test.ts`
-- [X] T014 [P] [US1] Integration test for recording mode in `tests/integration/recording-run.test.ts`
-
-### Implementation for User Story 1
-
-- [X] T015 [P] [US1] Implement `test()` registration and suite registry in `src/api/test.ts`
-- [X] T016 [P] [US1] Implement `expect()` terminal matchers in `src/api/expect.ts`
-- [X] T017 [P] [US1] Implement `Terminal` API (start/send/wait/snapshot) in `src/terminal/terminal.ts`
-- [X] T018 [US1] Implement single-runner execution pipeline in `src/runner/runner.ts`
-- [X] T019 [US1] Implement CLI command parsing + exit codes in `src/cli/index.ts`
-- [X] T020 [US1] Implement reporter with failure diagnostics in `src/cli/reporter.ts`
-- [X] T021 [US1] Implement recording mode (asciinema + tmux) in `src/recording/recorder.ts` and `src/terminal/tmux.ts`
-- [X] T022 [US1] Implement multi-pane `terminalFactory` in `src/terminal/factory.ts`
-- [X] T023 [US1] Export public API surface in `src/index.ts`
-- [X] ~~T024 [US1] Implement run status store and API handlers (REMOVED - API server not needed)~~
-- [X] ~~T025 [US1] Add HTTP server for `/runs` endpoints (REMOVED - API server not needed)~~
-
-**Checkpoint**: User Story 1 should be fully functional and testable independently
-
----
-
-## Phase 4: User Story 2 - Parallel Test Execution (Priority: P2)
-
-**Goal**: Parallel runs with isolated terminal state and independent reporting.
-
-**Independent Test**: A suite with multiple tests runs with `--workers` and shows concurrent execution with independent results.
-
-### Tests for User Story 2 (REQUIRED) ⚠️
-
-> **NOTE: Write these tests FIRST, ensure they FAIL before implementation**
-
-- [X] T026 [P] [US2] Integration test for parallel worker run in `tests/integration/parallel-run.test.ts`
-- [X] T027 [P] [US2] Unit test for scheduler aggregation in `tests/unit/scheduler.test.ts`
-
-### Implementation for User Story 2
-
-- [X] T028 [P] [US2] Implement worker process runner in `src/runner/worker.ts`
-- [X] T029 [US2] Implement scheduler + aggregation in `src/runner/scheduler.ts`
-- [X] T030 [US2] Add CLI `--workers` flag and config wiring in `src/cli/index.ts`
-- [X] T031 [US2] Ensure per-worker artifact isolation in `src/runner/artifacts.ts`
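The scheduling and aggregation idea behind T029 can be sketched as pure functions: assign tests round-robin to a fixed number of worker slots, then merge per-worker results into one summary. The real `src/runner/worker.ts` presumably uses separate processes; this models only the bookkeeping, with hypothetical names.

```typescript
type TestResult = { name: string; passed: boolean };

// Distribute tests round-robin across `workers` slots.
function schedule(tests: string[], workers: number): string[][] {
  const slots: string[][] = Array.from({ length: workers }, () => []);
  tests.forEach((t, i) => slots[i % workers].push(t));
  return slots;
}

// Merge per-worker results into one summary (cf. FR-005 counts).
function aggregate(perWorker: TestResult[][]): { total: number; failed: number } {
  const all = perWorker.flat();
  return { total: all.length, failed: all.filter((r) => !r.passed).length };
}

const slots = schedule(["a", "b", "c", "d", "e"], 2);
// slots: [["a", "c", "e"], ["b", "d"]]
const summary = aggregate(
  slots.map((s) => s.map((name) => ({ name, passed: name !== "d" }))),
);
// summary: { total: 5, failed: 1 }
```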
-
-**Checkpoint**: User Stories 1 and 2 should both work independently
-
----
-
-## Phase 5: User Story 3 - Maintainable Test Organization (Priority: P3)
-
-**Goal**: Suites, shared steps, and reusable fixtures that keep tests readable.
-
-**Independent Test**: A user can define suites and shared steps, then run a single suite independently.
-
-### Tests for User Story 3 (REQUIRED) ⚠️
-
-> **NOTE: Write these tests FIRST, ensure they FAIL before implementation**
-
-- [X] T032 [P] [US3] Unit test for suite grouping and steps in `tests/unit/describe-steps.test.ts`
-- [X] T033 [P] [US3] Integration test for shared fixtures and suite filtering in `tests/integration/fixtures-suites.test.ts`
-
-### Implementation for User Story 3
-
-- [X] T034 [P] [US3] Implement `test.describe` suite grouping in `src/api/describe.ts`
-- [X] T035 [P] [US3] Implement `test.step` with step reporting in `src/api/steps.ts`
-- [X] T036 [P] [US3] Implement hooks/fixtures (`beforeEach`, `afterEach`) in `src/api/hooks.ts`
-- [X] T037 [US3] Add suite filtering by name/pattern in `src/runner/loader.ts`
-- [X] T038 [US3] Bind shared fixtures into execution context in `src/runner/runner.ts`
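The suite-grouping intent of T034 can be sketched as name-prefixing: `describe` pushes a suite name onto a path so tests register with fully qualified names, which is what later makes name/pattern filtering (T037) possible. This mirrors the idea, not the actual `src/api/describe.ts` implementation.

```typescript
const registered: string[] = [];
let currentSuite = "";

// Nestable suite grouping: each describe extends the current name path.
function describe(name: string, body: () => void) {
  const prev = currentSuite;
  currentSuite = prev ? `${prev} > ${name}` : name;
  body();
  currentSuite = prev; // restore on exit so siblings are unaffected
}

function test(name: string) {
  registered.push(currentSuite ? `${currentSuite} > ${name}` : name);
}

describe("editor", () => {
  test("opens file");
  describe("search", () => { test("finds match"); });
});
test("top-level");
// registered:
//   "editor > opens file", "editor > search > finds match", "top-level"
```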
-
-**Checkpoint**: All user stories should now be independently functional
-
----
-
-## Phase 6: Polish & Cross-Cutting Concerns
-
-**Purpose**: Improvements that affect multiple user stories
-
-- [X] T039 [P] Add CLI help text and dependency checks in `src/cli/index.ts`
-- [X] T040 [P] Add timing utilities for performance tracking in `src/utils/timing.ts`
-- [X] T041 [P] Update `specs/001-tui-test-framework/quickstart.md` with validated examples
-
----
-
-## Dependencies & Execution Order
-
-### Phase Dependencies
-
-- **Setup (Phase 1)**: No dependencies - can start immediately
-- **Foundational (Phase 2)**: Depends on Setup completion - BLOCKS all user stories
-- **User Stories (Phase 3+)**: All depend on Foundational phase completion
- - User stories can then proceed in parallel (if staffed)
- - Or sequentially in priority order (P1 → P2 → P3)
-- **Polish (Final Phase)**: Depends on all desired user stories being complete
-
-### User Story Dependencies
-
-- **User Story 1 (P1)**: Can start after Foundational (Phase 2) - no dependencies
-- **User Story 2 (P2)**: Can start after Foundational (Phase 2) - integrates with runner/CLI
-- **User Story 3 (P3)**: Can start after Foundational (Phase 2) - builds on suite registry
-
-### Within Each User Story
-
-- Tests MUST be written and FAIL before implementation
-- Public API definitions before runner integration
-- Terminal wiring before step execution
-- Runner execution before CLI reporting
-- Story complete before moving to next priority
-
-### Parallel Opportunities
-
-- All Setup tasks marked [P] can run in parallel
-- All Foundational tasks marked [P] can run in parallel (within Phase 2)
-- Once Foundational is complete, user stories can be worked in parallel
-- API/terminal tasks within a story marked [P] can run in parallel
-
----
-
-## Parallel Example: User Story 1
-
-```bash
-Task: "Implement test() registration and suite registry in src/api/test.ts"
-Task: "Implement expect() terminal matchers in src/api/expect.ts"
-Task: "Implement Terminal API in src/terminal/terminal.ts"
-```
-
----
-
-## Parallel Example: User Story 2
-
-```bash
-Task: "Implement worker process runner in src/runner/worker.ts"
-Task: "Ensure per-worker artifact isolation in src/runner/artifacts.ts"
-```
-
----
-
-## Parallel Example: User Story 3
-
-```bash
-Task: "Implement test.describe suite grouping in src/api/describe.ts"
-Task: "Implement test.step with step reporting in src/api/steps.ts"
-Task: "Implement hooks/fixtures in src/api/hooks.ts"
-```
-
----
-
-## Implementation Strategy
-
-### MVP First (User Story 1 Only)
-
-1. Complete Phase 1: Setup
-2. Complete Phase 2: Foundational (CRITICAL - blocks all stories)
-3. Complete Phase 3: User Story 1
-4. **STOP and VALIDATE**: Test User Story 1 independently
-5. Demo MVP (CLI run + diagnostics + recording)
-
-### Incremental Delivery
-
-1. Complete Setup + Foundational → Foundation ready
-2. Add User Story 1 → Test independently → MVP
-3. Add User Story 2 → Test independently
-4. Add User Story 3 → Test independently
-5. Each story adds value without breaking previous stories
-
-### Parallel Team Strategy
-
-With multiple developers:
-
-1. Team completes Setup + Foundational together
-2. Once Foundational is done:
- - Developer A: User Story 1
- - Developer B: User Story 2
- - Developer C: User Story 3
-3. Stories complete and integrate independently
-
----
-
-## Notes
-
-- [P] tasks = different files, no dependencies
-- [Story] label maps task to specific user story for traceability
-- Each user story is independently completable and testable
-- Avoid vague tasks; include exact file paths