Composable skills that turn coding agents into disciplined engineers.
Works with Claude Code, Cursor, Codex, and OpenCode.
Arsyn is a constructed word:
- ars (Latin) — skill, technique, craft
- syn (Greek prefix) — together, integration, synthesis
Arsyn ≈ "synthesis of skills" — a composable library of behavioral skills for AI coding agents.
Arsyn activates automatically. When your agent encounters a task, it doesn't jump into writing code — it checks for relevant skills first. This is enforced, not suggested.
A typical feature build flows through structured phases: ideation → isolation → planning → orchestration → finalization. Each phase is a skill that triggers contextually.
Skills use progressive disclosure — the main SKILL.md stays lean while references/ directories hold detailed guides, templates, and scripts that load only when needed.
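A skill following this layout might look like the sketch below (file names are illustrative, not Arsyn's actual contents):

```
my-skill/
├── SKILL.md                # lean core: triggers, steps, gotchas
└── references/
    ├── detailed-guide.md   # loaded only when the task needs it
    └── templates/
        └── report.md
```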
- ideating — Explores intent through structured questions. Produces an approved design document.
- isolating — Creates a git worktree on a fresh branch, runs project setup, verifies clean test baseline.
- planning — Decomposes the design into bite-sized tasks with exact file paths, complete code, and verification steps.
- orchestrating or executing — Dispatches subagents per task with review gates, or runs inline with checkpoints.
- tdd — Enforces red-green-refactor: failing test → minimal code → commit. No production code without a failing test.
- requesting-review — Dispatches a code review subagent. Critical issues block progress.
- finalizing — Verifies tests pass, presents options (merge / PR / keep / discard), cleans up the worktree.
Structured requirement exploration before any implementation begins.
How: Interviews you using AskUserQuestion with focused, one-at-a-time questions about requirements, constraints, and success criteria. Presents 2-3 approaches with trade-offs. Produces a design document with confidence-tagged sections.
When: Triggers before any creative work — new features, components, architecture changes, or behavioral modifications. Mandatory gateway to planning.
Bonus: Integrates with the playground plugin for visual decisions (UI layouts, color schemes, architecture diagrams).
Decomposes approved designs into granular, executable task sequences.
How: Breaks work into bite-sized tasks (2-5 min each). Every task has exact file paths, complete copy-pasteable code, verification commands with expected output, and a commit step. Tasks are confidence-tagged (HIGH / MEDIUM / LOW).
When: After a design is approved via ideating. Use when you have a spec or requirements for a multi-step task.
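A single task in such a plan might look roughly like this (paths, commands, and numbers are invented for illustration, not the skill's actual template):

```markdown
### Task 3: Add slugify helper [HIGH]
- File: src/utils/slugify.ts
- Code: (complete, copy-pasteable implementation goes here)
- Verify: `npm test -- slugify` → expected output: 1 passed
- Commit: `git commit -m "feat: add slugify helper"`
```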
Runs through an implementation plan task-by-task in the current session.
How: Loads a plan file, executes tasks sequentially with checkpoint reviews between batches. Stops on failures and reports what went wrong.
When: After a plan exists and you choose inline execution (vs. orchestrating). Good for smaller plans or when you want direct oversight.
Distributes plan tasks across fresh subagents with two-stage quality gates.
How: Dispatches one subagent per task with isolated context. Each task goes through spec compliance review, then code quality review. Uses TaskCreate/TaskUpdate for inter-agent coordination and file system for shared state.
When: After a plan exists and you choose parallel execution. Recommended for larger plans — fresh context per task prevents drift.
Enforces red-green-refactor discipline on every implementation.
How: Every feature starts with a failing test. You watch it fail, write the minimal code to make it pass, verify it passes, then commit. No production code is written before a test exists for it.
When: During any feature or bugfix implementation. Active throughout the entire coding phase.
Reference: Includes references/testing-anti-patterns.md for common testing mistakes.
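The red-green-commit loop can be sketched as a toy shell session (a real project would use its own test runner; the file names here are made up):

```shell
# RED: write the test first and watch it fail
cat > test_greet.sh <<'EOF'
out=$(./greet.sh World)
[ "$out" = "Hello, World" ] || { echo "FAIL: got '$out'"; exit 1; }
echo PASS
EOF
chmod +x test_greet.sh
./test_greet.sh || echo "red: failing as expected"   # greet.sh does not exist yet

# GREEN: write the minimal code that makes the test pass
printf '#!/bin/sh\necho "Hello, $1"\n' > greet.sh
chmod +x greet.sh
./test_greet.sh   # prints PASS

# COMMIT: lock in the green state (in a real repo)
# git add greet.sh test_greet.sh && git commit -m "feat: add greet (tdd)"
```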
Root-cause investigation that eliminates guesswork.
How: Four-phase protocol: observe symptoms → trace to immediate cause → form and test hypotheses → implement targeted fix. Writes findings to a debug-log.md file so you can grep across them instead of losing context.
When: Any bug, test failure, or unexpected behavior — before proposing fixes. Prevents the "try random changes until it works" pattern.
References: references/root-cause-tracing.md, references/defense-in-depth.md, references/condition-based-waiting.md
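A debug-log.md entry produced by this protocol might be shaped like the following (the scenario is invented for illustration):

```markdown
## Flaky login test (auth.test.ts)
- Symptom: times out roughly 1 run in 5
- Immediate cause: token refresh races the first authenticated request
- Hypothesis: serializing the refresh removes the flake → confirmed over repeated runs
- Fix: await the refresh before issuing the request
```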
Enforces evidence-based completion claims.
How: Pre-flight checklist before any "done" claim: run tests, run linter, run build, check requirements, check edge cases, check git status. Every claim maps to a verification command with expected output.
When: Before claiming work is complete, before committing, before creating PRs. Catches "it works on my machine" and "I'm pretty sure it passes" claims.
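The checklist amounts to running every gate and refusing the "done" claim if any fails; a minimal sketch, where `true` is a placeholder for your project's real command per gate:

```shell
set -e                         # abort the "done" claim at the first failing gate
results=""
for gate in test lint build requirements edge-cases git-status; do
  true                         # placeholder: e.g. npm run "$gate", git status, etc.
  results="$results $gate:ok"
done
echo "all gates passed:$results"
```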
Creates isolated git worktrees for feature development.
How: Resolves worktree directory (existing > CLAUDE.md directive > prompt), verifies .gitignore coverage, creates worktree on new branch, auto-detects and runs project setup (npm, cargo, pip, go mod, etc.), runs baseline tests.
When: Before starting feature work that needs isolation from the current workspace, or before executing implementation plans.
Reference: references/setup-commands.md for ecosystem-specific install/test commands.
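Under the hood this is ordinary `git worktree` plumbing; a self-contained toy version, where a throwaway repo stands in for your project and the auto-detected setup step is shown only as a comment:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q -b main
git config user.email demo@example.com && git config user.name demo
git commit -q --allow-empty -m "baseline"

git worktree add "$repo-wt" -b feature/demo   # fresh branch, isolated directory
cd "$repo-wt"
# isolating would now auto-detect the ecosystem and run e.g.:
#   npm install && npm test    (verify a clean baseline before any changes)
git status --porcelain         # empty: clean starting point
```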
Guides completion of development work.
How: Verifies all tests pass, then presents four options: (1) merge locally and clean up worktree, (2) push and create PR, (3) keep branch as-is for later, (4) discard with typed confirmation. Handles cleanup automatically.
When: When implementation is complete and all tests pass. The last step in the core workflow.
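Option (1) boils down to a merge plus worktree cleanup; a self-contained sketch on a throwaway repo (branch and path names are illustrative):

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q -b main
git config user.email demo@example.com && git config user.name demo
git commit -q --allow-empty -m "init"
git worktree add -q "$repo-wt" -b feature/demo
git -C "$repo-wt" commit -q --allow-empty -m "feature work"

git merge -q feature/demo        # (1) merge the finished branch into main
git worktree remove "$repo-wt"   # clean up the isolated worktree
git branch -q -d feature/demo    # delete the fully merged branch
```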
Dispatches a code review subagent to audit completed work.
How: Gathers git SHAs for the work range, dispatches a read-only code-reviewer subagent with structured template. Reviewer categorizes issues as Critical / Important / Minor. Critical issues block progress.
When: After completing a task or feature, before merging. Required between orchestrated subagent tasks.
Reference: references/code-reviewer.md for the review template.
Processes code review feedback with technical rigor.
How: Six-step protocol: read → understand → verify → evaluate → respond → implement. Verifies each suggestion technically before accepting. Rejects performative agreement ("You're absolutely right!"). Applies YAGNI filter to suggested additions.
When: After receiving code review feedback, before implementing suggestions. Especially important when feedback seems unclear or technically questionable.
Dispatches independent tasks to concurrent subagents.
How: One agent per independent problem domain. Each gets a structured prompt (scope, context, constraints, deliverable). Results are reviewed and integrated after all agents complete.
When: Facing 2+ independent tasks that share no state — batch file processing, multi-service updates, independent test suites, or parallel research queries.
Reference: references/agent-prompt-template.md for structuring subagent prompts.
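The structured prompt amounts to four labeled sections; an illustrative brief (invented for this example, not the actual template in references/agent-prompt-template.md):

```markdown
## Subagent brief: migrate logging in service-a
- Scope: replace console.* calls in services/a/ only; do not touch services/b/
- Context: the project standardized on a single logger; see services/a/logger.ts
- Constraints: no new dependencies; public function signatures stay unchanged
- Deliverable: a diff plus the exact command you ran to verify it
```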
Generates project scaffolds and module templates.
How: Detects existing codebase conventions (framework, language, test runner, lint config), matches to appropriate scaffolding approach, generates with project conventions applied, verifies generated code passes lint and typecheck.
When: Creating new projects, services, modules, or components. Prevents generating boilerplate that clashes with existing patterns.
Reference: references/common-scaffolds.md for templates by pattern type.
Structured incident investigation from symptoms to root cause.
How: Five-phase protocol: intake (capture symptoms, timeline, blast radius) → triage (severity, immediate mitigation) → investigate (evidence gathering, hypothesis testing) → resolve (fix, verify, monitor) → document (post-incident report).
When: Production incidents, service outages, alert responses, or error patterns. Integrates with arsyn:debugging for deep root-cause analysis.
Reference: references/incident-template.md for post-incident report format.
Behavioral guardrails enforced during every implementation task.
How: Five rules checked before presenting any result: (1) reasoning-action coherence — code matches analysis, (2) no fabrication — nothing claimed without verification, (3) full transparency — limitations and unhandled cases reported, (4) no invented rules — cite actual config, (5) honest difficulty — partial solutions labeled as such.
When: Automatically active during any implementation — coding, building, fixing, refactoring, debugging. Always on, not opt-in.
Reference: references/confidence-scale.md for the HIGH/MEDIUM/LOW tagging system.
Research basis: Inspired by Schoen et al. (2025) — "Stress Testing Deliberative Alignment for Anti-Scheming Training". The paper showed that making models explicitly reason about safety rules before acting reduces covert misaligned behavior from ~13% to ~0.4%. Guarding applies this principle to coding: force explicit reasoning about integrity rules before presenting any implementation result.
Post-implementation integrity and quality audit.
How: Checks 5 integrity criteria (coherence, fabrication, omissions, convention compliance, shortcuts) and 4 technical quality criteria (defects, security, efficiency, clarity). Produces structured JSON report with per-criterion PASS/FAIL and overall verdict: APPROVED / APPROVED_WITH_CAVEATS / REJECTED.
When: After implementation is complete. Triggered via /review command or when asked to audit, verify, or check completed work.
Research basis: Complements guarding by auditing the same failure patterns identified in Schoen et al. (2025) — fabrication, omission, and reasoning-action divergence — after implementation rather than during.
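The report might be shaped roughly like this (field names here are illustrative; the skill defines the actual schema):

```json
{
  "integrity": {
    "coherence": "PASS",
    "fabrication": "PASS",
    "omissions": "PASS",
    "convention_compliance": "PASS",
    "shortcuts": "FAIL"
  },
  "quality": {
    "defects": "PASS",
    "security": "PASS",
    "efficiency": "PASS",
    "clarity": "PASS"
  },
  "verdict": "APPROVED_WITH_CAVEATS"
}
```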
Metacognitive calibration for maximum output accuracy.
How: Activates a deliberate verification protocol: pause before output, cross-check each claim against evidence, flag uncertainty explicitly, re-verify the final result before presenting.
When: Production deployments, data migrations, security changes, financial calculations, or any task where errors are costly. Opt-in when the stakes are high.
Creates, tests, and refines agent skills using evaluation-driven development.
How: Applies TDD to skill writing: define failure scenarios first, write the skill, test against scenarios with subagents, iterate. Incorporates Anthropic's official best practices (conciseness, progressive disclosure, degrees of freedom) and Thariq's principles (description-as-trigger, gotchas-first, evaluation-driven).
When: Creating new skills, editing existing skills, or verifying skills work before deployment.
References: references/anthropic-best-practices.md, references/thariq-skill-categories.md, references/search-optimization.md
Session-start skill discovery protocol.
How: Injected into every conversation via the session-start hook. Establishes the core rule: invoke relevant skills before any response. Provides the skill catalog and instruction priority (user > Arsyn > default system prompt).
When: Loaded automatically at session start. You never invoke this manually — it's always present.
| Command | Purpose |
|---|---|
| /ideate | Start structured ideation session |
| /plan | Create implementation plan from spec |
| /execute | Run through a plan task-by-task |
| /review | Run integrity and quality audit |
| /ci | Run lint, typecheck, test, build in parallel |
| /pr | Create a pull request |
| /respond | Address PR review comments |
| /deslop | Clean up AI-generated code patterns |
| /careful | Block dangerous commands until deactivated |
| /freeze | Block edits outside specified directories |
| Agent | Role |
|---|---|
| code-reviewer | Read-only senior code reviewer. Audits completed work against plans and coding standards. |
| project-structure-validator | Validates and fixes project structure against conventions in CLAUDE.md or AGENTS.md. |
```
# Add the marketplace
/plugin marketplace add renathoaz/arsyn

# Install the plugin
/plugin install arsyn@renathoaz
```

For Cursor:

```
git clone https://github.com/renathoaz/arsyn ~/.cursor/plugins/arsyn
```

Restart Cursor to load the plugin.
```
git clone https://github.com/renathoaz/arsyn ~/.codex/arsyn
mkdir -p ~/.agents/skills
ln -s ~/.codex/arsyn/skills ~/.agents/skills/arsyn
```

Restart Codex to discover the skills.
Add to your opencode.json:
```
{
  "plugin": ["arsyn"]
}
```

Restart OpenCode. The plugin auto-installs and registers all skills.
See platform-specific guides: Codex · OpenCode
Built on lessons from Thariq's "How We Use Skills" and the Anthropic skill authoring best practices:
- Progressive disclosure — Skills are folders, not just markdown. Heavy content lives in references/ and loads on demand.
- Gotchas-first — The highest-signal content in any skill is what goes wrong. Every skill has a Gotchas section built from real failure patterns.
- Description as trigger — Skill descriptions define when to activate, not what the skill does.
- Don't state the obvious — Only include what pushes the agent beyond its default behavior.
- Don't railroad — Provide information and flexibility. Match specificity to task fragility.
- Evidence over claims — Verify before declaring success.
- Integrity first — No shortcuts, no fabrication, full transparency.
- Schoen, B., Nitishinskaya, E., Balesni, M., et al. (2025). Stress Testing Deliberative Alignment for Anti-Scheming Training. arXiv:2509.15541. https://arxiv.org/abs/2509.15541