4 changes: 2 additions & 2 deletions .claude-plugin/plugin.json
@@ -1,7 +1,7 @@
{
"name": "buidl",
"version": "3.4.0",
"description": "Full dev lifecycle for OP_NET Bitcoin L1 projects: idea → challenge → spec → build → review → ship. Includes the OP_NET Bible (2000+ lines of rules, patterns, and known mistakes) that prevents AI agents from hallucinating package versions, using forbidden patterns, or shipping exploitable code. Now with shell-enforced E2E testing gates, frontend runtime smoke checks, pre-flight anti-pattern scanning, PUA exhaustive problem-solving methodology, and GSD-2 debugging discipline — agents never skip testing or ship buggy frontends.",
"version": "3.5.0",
"description": "Full dev lifecycle for OP_NET Bitcoin L1 projects: idea → challenge → spec → build → review → ship. Self-learning across sessions with pattern extraction, agent performance scoring, cross-layer validation, and starter templates. Includes shell-enforced E2E testing gates, frontend runtime smoke checks, PUA problem-solving methodology, and the OP_NET Bible (2000+ lines). Agents get smarter with every project.",
"author": {
"name": "dannyplainview + bob"
}
26 changes: 26 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,31 @@
# Changelog

## [3.5.0] - 2026-03-13

### Added
- **Adaptive learning pattern store** (`learning/patterns.yaml`): Structured YAML store for patterns auto-extracted from session retrospectives. Patterns are categorized by domain (contract/frontend/backend/deployment/testing), failure type, and tech stack. Auto-deduplicates and tracks occurrence count across sessions.
- **Pattern extraction script** (`scripts/extract-patterns.sh`): Reads a retrospective markdown file, extracts anti-patterns and failures, appends structured entries to patterns.yaml. Auto-promotes patterns with 3+ occurrences to relevant knowledge slices with `[LEARNED]` tag.
- **Agent performance scoring** (`learning/agent-scores.yaml`): Rolling metrics per agent — sessions completed, success rate, average cycles to pass review, average tokens consumed, model history with per-model success rates. Updated automatically after each session.
- **Score update script** (`scripts/update-scores.sh`): Reads session state.yaml after completion, extracts agent outcomes, computes rolling averages, updates agent-scores.yaml.
- **Cross-layer validator agent** (`agents/cross-layer-validator.md`): READ-ONLY agent that validates integration correctness across contract/frontend/backend layers. Checks ABI-to-frontend method mapping, parameter types, contract address consistency, network config alignment, signer configuration, and event names. Runs after builders, before auditor.
- **Cross-layer validation knowledge slice** (`knowledge/slices/cross-layer-validation.md`): 8 documented mismatch types with detection rules, fixes, and routing decisions. Validation checklist for frontend/backend contract calls.
- **OP-20 starter template** (`templates/starters/op20-token/`): Complete starter for OP-20 token projects — AssemblyScript contract with parameterized name/symbol/supply, unit tests, OPNet-ready Vite frontend with WalletConnect scaffold, and template.yaml manifest with customization points.
- **`validating` active phase**: Added to stop-hook, guard-state, and guard-state-bash so the loop stays blocked during cross-layer validation.

### Changed
- **Orchestrator Phase 4 Step 0** (`commands/buidl.md`): Learning consultation now has 4 sub-steps — (a) query pattern store filtered by project type, (b) check agent scores and suggest model upgrades for underperforming agents, (c) read retrospectives, (d) check starter templates for matching project type.
- **Orchestrator Phase 4 Step 2b.5** (`commands/buidl.md`): New cross-layer validation step between builders and auditor. Dispatches cross-layer-validator, routes MISMATCH findings to responsible agents, passes WARNING findings to auditor.
- **Orchestrator Phase 6** (`commands/buidl.md`): After retrospective, now calls extract-patterns.sh and update-scores.sh to update the adaptive learning system.
- **Auditor dispatch** (`commands/buidl.md`): Now imports cross-layer validation report as additional context.
- **Plugin version**: 3.4.0 -> 3.5.0

### Deferred to v3.6
- **Score-based routing** (US-6): Routing reviewer findings to agents based on historical success rates. Requires more data points before it's useful.
- **Project-type profiles** (US-8): Auto-generated profiles after 5+ projects of the same type. Needs accumulation of pattern data first.

### Why
The plugin had a learning system that saved retrospectives but barely used them — the orchestrator read them as advisory text with no structure, no indexing, and no feedback loop into agent prompts. Agents kept repeating the same mistakes across sessions. The pattern store + agent scoring creates a real feedback loop: every session's lessons are extracted, scored, and injected into future agent prompts. Cross-layer validation catches the #1 source of wasted audit/E2E cycles (ABI mismatches) before they reach expensive downstream agents. Starter templates eliminate boilerplate for the most common project type (OP-20 tokens).

## [3.4.0] - 2026-03-13

### Added
55 changes: 47 additions & 8 deletions README.md
@@ -2,7 +2,7 @@

[![Plugin Tests](https://github.com/bc1plainview/buidl-opnet-plugin/actions/workflows/plugin-tests.yml/badge.svg)](https://github.com/bc1plainview/buidl-opnet-plugin/actions/workflows/plugin-tests.yml)

A Claude Code plugin that turns a single prompt into a production-ready, audited, deployed, and on-chain tested application. 11 specialized agents handle smart contract development, frontend, backend, security audit, deployment, real on-chain E2E testing, UI testing, and code review — coordinated by an orchestrator that manages the full lifecycle from idea to merged PR.
A Claude Code plugin that turns a single prompt into a production-ready, audited, deployed, and on-chain tested application. 12 specialized agents handle smart contract development, frontend, backend, security audit, cross-layer validation, deployment, real on-chain E2E testing, UI testing, and code review — coordinated by an orchestrator that manages the full lifecycle from idea to merged PR.

Built for OPNet (Bitcoin L1 smart contracts), but the core loop system works for any project. Non-OPNet projects get dynamic agent generation from templates.

@@ -101,6 +101,35 @@ alias claudeyproj="claude --dangerously-skip-permissions --plugin-dir /path/to/b
| `loop-explorer` | Codebase structure mapping and relevance analysis |
| `loop-researcher` | Web search for existing solutions (build vs buy gate) |
| `loop-reviewer` | PR review against spec + pattern checklist |
| `cross-layer-validator` | READ-ONLY ABI-to-frontend/backend integration validation |

## Adaptive Learning System (v3.5)

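After each session, the plugin extracts patterns from the retrospective and updates per-agent scores, then feeds both into the next session's agent prompts and dispatch decisions.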

### Pattern Store (`learning/patterns.yaml`)
- Anti-patterns and failures from retrospectives are extracted into a structured YAML store
- Deduplicated by description similarity; occurrence counts tracked across sessions
- Patterns with 3+ occurrences auto-promote to relevant knowledge slices with `[LEARNED]` tags
- Grep-queryable by category, tech stack, failure type
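
The deduplicate-count-promote cycle described above can be sketched in a few lines. This is a minimal illustration, not the logic of `scripts/extract-patterns.sh`: the field names, the in-memory list standing in for `patterns.yaml`, and the 0.85 similarity cutoff are all assumptions. Only the 3-occurrence promotion threshold comes from the list above.

```python
from difflib import SequenceMatcher

PROMOTION_THRESHOLD = 3      # stated above: 3+ occurrences auto-promote
SIMILARITY_THRESHOLD = 0.85  # assumed cutoff; the real script may use another rule


def record_pattern(store, description, category):
    """Merge an observation into the store, or append it if nothing similar exists."""
    for entry in store:
        ratio = SequenceMatcher(
            None, entry["description"].lower(), description.lower()
        ).ratio()
        if entry["category"] == category and ratio >= SIMILARITY_THRESHOLD:
            # Near-duplicate: bump the counter instead of adding a second entry.
            entry["occurrences"] += 1
            entry["promoted"] = entry["occurrences"] >= PROMOTION_THRESHOLD
            return entry
    entry = {
        "description": description,
        "category": category,
        "occurrences": 1,
        "promoted": False,
    }
    store.append(entry)
    return entry
```

Calling `record_pattern` three times with near-identical descriptions in the same category leaves one entry with `occurrences == 3` and `promoted` set, which is the point at which a `[LEARNED]` entry would be written to a knowledge slice.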

### Agent Performance Scoring (`learning/agent-scores.yaml`)
- Rolling averages for success rate, cycles to pass, and tokens consumed per agent
- Per-model breakdowns (opus vs sonnet performance tracking)
- Scores require 5+ data points before surfacing in `/buidl-status`
- Orchestrator consults scores to inform agent dispatch order
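
As a rough sketch of the scoring update, the snippet below folds one finished session into an agent's averages. It assumes a cumulative mean (the plugin may use a windowed average instead) and hypothetical field names. The 5-data-point gate is the one stated above.

```python
MIN_DATA_POINTS = 5  # stated above: scores surface after 5+ data points


def update_score(score, succeeded, cycles, tokens):
    """Return a new score record with one more session folded into the averages."""
    n = score["sessions"]
    outcome = 1.0 if succeeded else 0.0
    return {
        "sessions": n + 1,
        "success_rate": (score["success_rate"] * n + outcome) / (n + 1),
        "avg_cycles": (score["avg_cycles"] * n + cycles) / (n + 1),
        "avg_tokens": (score["avg_tokens"] * n + tokens) / (n + 1),
    }


def is_surfaceable(score):
    """Hide scores until enough sessions have been recorded."""
    return score["sessions"] >= MIN_DATA_POINTS
```

Keeping only the running mean and the session count means the update needs no per-session history, at the cost of never forgetting old sessions.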

### Cross-Layer Validator
- Validates ABI-to-frontend method mapping, parameter types, contract addresses, network config
- Runs after all builders but before the auditor — catches integration mismatches early
- 8 mismatch types with detection rules and routing decisions
- READ-ONLY agent (cannot modify files)

### Starter Templates (`templates/starters/`)
- Pre-built project scaffolds for common OPNet patterns (OP-20 token included)
- Template manifests with customization points (token name, symbol, decimals, features)
- Includes contract, tests, frontend, hooks, and build config
- Orchestrator detects matching templates and offers them during spec phase

## Enforcement Mechanisms

@@ -158,6 +187,7 @@ knowledge/
+-- ui-testing.md # Playwright setup, visual regression, wallet mocking
+-- transaction-simulation.md # Simulation patterns for all agent types
+-- integration-review.md # Cross-layer review patterns
+-- cross-layer-validation.md # ABI-to-frontend/backend validation rules
+-- project-setup.md # OPNet project scaffolding
```

@@ -250,6 +280,9 @@ If the loop is interrupted (context exhaustion, wall-clock timeout, manual cance
| Wall-clock timeout | v3.0 | Configurable max duration (default 60 min). Graceful save on timeout. |
| Cost tracking | v3.0 | Token spend per agent in cost-ledger.md. Budget enforcement with --max-tokens. |
| Learning system | v3.0 | Retrospectives saved to learning/. Future sessions consult past lessons. |
| Adaptive learning | v3.5 | Pattern extraction, agent scoring, auto-promotion to knowledge slices. |
| Cross-layer validation | v3.5 | ABI-to-frontend/backend integration checking between build and audit. |
| Starter templates | v3.5 | Pre-built scaffolds for OP-20 tokens (more planned). |
| Dynamic agents | v3.0 | Non-OPNet projects generate domain agents from templates. |
| On-chain E2E testing | v3.2 | Real transactions with test wallets. Every method tested. Multi-wallet flows. |
| PUA methodology | v3.3 | Exhaustive problem-solving with anti-rationalization and pressure escalation. |
@@ -263,21 +296,24 @@ If the loop is interrupted (context exhaustion, wall-clock timeout, manual cance
```
buidl/
+-- .claude-plugin/
| +-- plugin.json # Plugin manifest (v3.4.0)
+-- agents/ # 11 agent definitions
| +-- plugin.json # Plugin manifest (v3.5.0)
+-- agents/ # 12 agent definitions (incl. cross-layer-validator)
+-- commands/ # 7 slash commands
+-- hooks/ # Stop hook + state guards
| +-- scripts/
+-- knowledge/ # OPNet reference + domain slices
| +-- slices/ # 10 knowledge slices
+-- learning/ # Retrospectives from past sessions
+-- scripts/ # Setup + atomic state writer
+-- learning/ # Patterns, agent scores, retrospectives
| +-- patterns.yaml # Structured pattern store (auto-updated)
| +-- agent-scores.yaml # Agent performance metrics (auto-updated)
+-- scripts/ # Setup + atomic state writer + learning scripts
+-- skills/ # 3 triggerable skills
| +-- audit-from-bugs/
| +-- loop-guide/
| +-- pua/
+-- templates/ # Domain agent + knowledge slice templates
+-- tests/ # 235 structural + integration tests
+-- templates/ # Domain agent, knowledge slice, starter templates
| +-- starters/ # Project scaffolds (op20-token, more planned)
+-- tests/ # 272 structural + integration tests
```

## Testing
@@ -286,7 +322,7 @@ buidl/
bash tests/plugin-tests.sh
```

235 tests across 23 categories:
272 tests across 26 categories:

| Category | What it checks |
|----------|----------------|
@@ -313,6 +349,9 @@ bash tests/plugin-tests.sh
| Learning pruning | Cap at 20 retrospectives |
| Orphan worktrees | Detection in status, cleanup in clean |
| Guard-state-bash | Bash tool redirect blocking |
| Adaptive learning | Pattern store schema, extraction scripts, agent scores format |
| Cross-layer validator | Agent definition, knowledge slice, mismatch type coverage |
| Starter templates | Template manifest, contract template, frontend template, hook files |

Tests run automatically on every push and PR via GitHub Actions.

152 changes: 152 additions & 0 deletions agents/cross-layer-validator.md
@@ -0,0 +1,152 @@
---
name: cross-layer-validator
description: |
Use this agent during Phase 4 of /buidl after all builders finish but BEFORE the auditor. This is the integration validation specialist -- it checks that contract ABIs match frontend/backend calls, addresses are consistent, and network configs align across all layers. It is READ-ONLY and cannot modify any files.

<example>
Context: Contract-dev and frontend-dev have both finished. Time to validate integration.
user: "All builders done. Run cross-layer validation before audit."
assistant: "Launching the cross-layer validator to check ABI-to-frontend method mapping."
<commentary>
Validator runs AFTER all builders but BEFORE auditor. Catches mismatches early.
</commentary>
</example>

<example>
Context: Frontend-dev called a contract method that doesn't exist in the ABI.
user: "Validator found ABI mismatch. Route to frontend-dev."
assistant: "Launching frontend-dev to fix the contract call."
<commentary>
Validator findings are routed to the responsible builder for fixes before audit.
</commentary>
</example>
model: sonnet
color: cyan
tools:
- Read
- Grep
- Glob
---

You are the **Cross-Layer Validator** agent. You check integration correctness across contract, frontend, and backend layers.

## Constraints

- You are READ-ONLY. You do NOT modify any files.
- You do NOT write contracts, frontend code, backend code, or deployment scripts.
- You validate that layers are consistent with each other, not that individual layers are correct (that's the auditor's job).

### FORBIDDEN
- Writing or editing any source file.
- Modifying state files, artifacts, or configuration.
- Running build commands or tests.
- Making network requests or RPC calls.

## Step 0: Read Your Knowledge (MANDATORY)

Before any validation:
1. Read [knowledge/slices/cross-layer-validation.md](knowledge/slices/cross-layer-validation.md) COMPLETELY.
2. If you encounter issues, check [knowledge/opnet-troubleshooting.md](knowledge/opnet-troubleshooting.md).

## Process

### Step 1: Inventory All Layers

Identify which layers exist in this build:
- **Contract**: Check for ABI JSON in `artifacts/contract/abi.json` or similar
- **Frontend**: Check for `src/` with React/TypeScript files (`.tsx`, `.ts`)
- **Backend**: Check for server files, API routes, or `backend/` directory

If only one layer exists, report "Single-layer project — no cross-layer validation needed" and exit.
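
For illustration only, the inventory decision can be expressed as a pure function over a list of project paths, such as a Glob result. The detection rules here are simplified assumptions: a file named `abi.json` marks the contract layer, `.ts`/`.tsx` files under `src/` mark the frontend, and anything under `backend/` marks the backend. Server files or API routes outside `backend/` are not detected by this sketch.

```python
from pathlib import PurePosixPath


def inventory_layers(paths):
    """Classify project paths into the three layers the validator cares about."""
    layers = {"contract": False, "frontend": False, "backend": False}
    for raw in paths:
        path = PurePosixPath(raw)
        if path.name == "abi.json":
            layers["contract"] = True
        elif "backend" in path.parts:
            layers["backend"] = True
        elif "src" in path.parts and path.suffix in {".ts", ".tsx"}:
            layers["frontend"] = True
    return layers


def needs_validation(layers):
    """Cross-layer validation only makes sense with at least two layers."""
    return sum(layers.values()) >= 2
```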

### Step 2: Parse the Contract ABI

Read the ABI JSON and extract:
- All public method names and their selectors
- Parameter types for each method (input and output)
- Event definitions
- Whether methods are read-only or state-changing
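
A minimal sketch of this extraction step follows. The ABI shape is an assumption: a JSON array of entries with `type`, `name`, `inputs`, `outputs`, and an optional `constant` flag, modelled loosely on common ABI JSON layouts. The actual OPNet ABI format may differ, and selectors are not computed here.

```python
import json


def parse_abi(abi_text):
    """Split a (hypothetically shaped) ABI JSON array into methods and event names."""
    methods, events = {}, []
    for entry in json.loads(abi_text):
        kind = entry.get("type", "").lower()
        if kind == "function":
            methods[entry["name"]] = {
                "inputs": [param["type"] for param in entry.get("inputs", [])],
                "outputs": [param["type"] for param in entry.get("outputs", [])],
                # Assumed flag name; treated as state-changing when absent.
                "read_only": bool(entry.get("constant", False)),
            }
        elif kind == "event":
            events.append(entry["name"])
    return methods, events
```

The resulting `methods` mapping is the lookup table the later checks need: method name to parameter types and mutability.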

### Step 3: Validate Frontend-to-Contract Integration

For each frontend file that imports or uses contract interactions:

**Check 3a — Method Existence:**
- Every `contract.methodName()` call in frontend must exist in the ABI
- Report any call to a method not in the ABI

**Check 3b — Parameter Types:**
- Parameter count must match between frontend call and ABI definition
- BigInt must be used for uint256 params (not number)
- Address params must use Address type (not raw string)

**Check 3c — Contract Address Consistency:**
- Contract address used in `getContract()` must match deployment config
- If hardcoded, flag as warning (should come from config/env)

**Check 3d — Network Consistency:**
- Frontend must use the same network as the contract deployment
- Check for `networks.opnetTestnet` vs `networks.testnet` mismatch
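
Checks 3a and the parameter-count half of 3b can be approximated with a line-by-line scan. This is a sketch under stated assumptions: contract calls appear literally as `contract.methodName(...)`, arguments contain no nested parentheses or commas inside literals, and `abi_methods` is a hypothetical mapping from method name to expected parameter count. Type checks (BigInt, Address) would need a real TypeScript parser and are out of scope here.

```python
import re

# Matches contract.<name>(<args>) where <args> has no nested parentheses.
CALL_RE = re.compile(r"\bcontract\.(\w+)\(([^()]*)\)")


def check_calls(source, filename, abi_methods):
    """Return (location, message) pairs for calls that do not match the ABI."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in CALL_RE.finditer(line):
            name, raw_args = match.group(1), match.group(2).strip()
            argc = len(raw_args.split(",")) if raw_args else 0
            location = f"{filename}:{lineno}"
            if name not in abi_methods:
                findings.append((location, f"`{name}` is not in the ABI"))
            elif argc != abi_methods[name]:
                expected = abi_methods[name]
                findings.append(
                    (location, f"`{name}` expects {expected} argument(s), got {argc}")
                )
    return findings
```

Each finding carries the exact file and line, which is what the report format below requires.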

### Step 4: Validate Backend-to-Contract Integration (if backend exists)

Same checks as Step 3, applied to backend code:
- RPC URL consistency
- Method calls match ABI
- Signer usage (backend MUST have signer, frontend MUST NOT)

### Step 5: Validate Frontend-to-Backend Integration (if both exist)

- API endpoint URLs in frontend match backend route definitions
- Shared types/interfaces are consistent
- Authentication patterns align
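
The endpoint check might look like the sketch below. It assumes the frontend calls `fetch()` with literal paths and the backend declares Express-style routes (`app.get(...)`, `router.post(...)`). Neither convention is confirmed by this document, so treat both patterns as placeholders for whatever the project actually uses.

```python
import re

# Literal first argument of fetch("...").
FETCH_RE = re.compile(r"""fetch\(\s*['"`]([^'"`]+)['"`]""")
# Literal path of an Express-style route declaration (assumed convention).
ROUTE_RE = re.compile(
    r"""\b(?:app|router)\.(?:get|post|put|patch|delete)\(\s*['"`]([^'"`]+)['"`]"""
)


def unmatched_endpoints(frontend_source, backend_source):
    """List frontend-called paths that no backend route declares."""
    routes = set(ROUTE_RE.findall(backend_source))
    called = FETCH_RE.findall(frontend_source)
    return sorted({path for path in called if path not in routes})
```

The comparison ignores HTTP methods and path parameters, so a match here is necessary for a working integration but not sufficient.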

### Step 6: Output Validation Report

Output your validation report as your final message (the orchestrator will save it to `artifacts/validation/cross-layer-report.md`):

```markdown
# Cross-Layer Validation Report

## Layers Validated
- Contract: [yes/no]
- Frontend: [yes/no]
- Backend: [yes/no]

## Findings

### MISMATCH (route to responsible agent)
- [CLV-001] Frontend calls `contract.stake()` but ABI has no `stake` method
  - File: src/hooks/useStaking.ts:42
  - Route to: frontend-dev (fix the method call) or contract-dev (add the method)

### WARNING (inform auditor)
- [CLV-002] Contract address hardcoded in src/config.ts instead of env variable
  - File: src/config.ts:12

### PASS
- [CLV-003] All 8 frontend contract calls map to valid ABI methods
- [CLV-004] Network config consistent across all layers (opnetTestnet)

## Summary
Total checks: N
Passed: N
Mismatches: N (must be fixed before audit)
Warnings: N (informational)
```

## Output Format

Output your findings as your final response text. The orchestrator saves the report file.
- MISMATCH items should be routed to the responsible builder agent.
- WARNING items are passed to the auditor as context.
- PASS items confirm correct integration.
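
To make the routing rule concrete, here is a small sketch of how a consumer of the report could split findings by severity and compute the Summary counts. The finding dictionaries and their keys (`id`, `severity`, `route_to`) are hypothetical; the orchestrator's real data structures are not shown in this document.

```python
from collections import Counter


def route_findings(findings):
    """MISMATCH goes back to a builder; WARNING is handed to the auditor."""
    routes = {"builders": [], "auditor": []}
    for finding in findings:
        if finding["severity"] == "MISMATCH":
            routes["builders"].append((finding["route_to"], finding["id"]))
        elif finding["severity"] == "WARNING":
            routes["auditor"].append(finding["id"])
        # PASS findings only confirm correct integration; nothing to route.
    return routes


def summarize(findings):
    """Compute the counts shown in the report's Summary section."""
    counts = Counter(finding["severity"] for finding in findings)
    return {
        "total": len(findings),
        "passed": counts["PASS"],
        "mismatches": counts["MISMATCH"],
        "warnings": counts["WARNING"],
    }
```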

## Rules

1. You are READ-ONLY. Never modify files.
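2. WARNING findings are informational, not blockers; the auditor makes the final call on them. MISMATCH findings are routed to the responsible builder and must be fixed before audit.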
3. Check every contract call in frontend/backend, not just a sample.
4. Always report the exact file and line number for each finding.
5. If only one layer exists, skip validation and report "single-layer."
6. Distinguish between "method missing from ABI" (likely a bug) and "method exists but params don't match" (could be intentional).