The Red Council

AI Red Team & Security — Attack. Assess. Patch.

What is The Red Council?

The Red Council is an automated adversarial testing platform for Large Language Models. It implements a closed-loop security workflow that identifies vulnerabilities, generates defenses automatically, and verifies their effectiveness in real time.

It leverages Gemini 3 Pro for attack generation, judging, and defense.

Core Loop

  1. Attack: Red Team agent generates adversarial prompts using a Knowledge Base of 165+ curated artifacts.
  2. Judge: Impartial evaluator scores the target's response for security breaches (secret leakage, policy violations).
  3. Defend: If a breach is detected, the Blue Team agent automatically hardens the target's system prompt.
  4. Verify: The orchestrator re-runs the attack against the hardened model to prove the fix works.
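For orientation, the sketch below traces that loop in Python. Everything here is illustrative: red_team, judge, blue_team, and target are hypothetical stand-ins, not the project's actual LangGraph API.

# Illustrative sketch only; the real orchestration is a LangGraph graph.
def run_campaign(target, red_team, judge, blue_team, rounds=5):
    for _ in range(rounds):
        attack = red_team.generate(target)        # 1. Attack (RAG over the KB)
        response = target.respond(attack)
        verdict = judge.score(attack, response)   # 2. Judge
        if verdict.breached:
            # 3. Defend: harden the target's system prompt
            target.system_prompt = blue_team.harden(
                target.system_prompt, attack, response
            )
            # 4. Verify: re-run the same attack against the hardened target
            assert not judge.score(attack, target.respond(attack)).breached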

Key Features

  • Multi-Agent Adversarial Flow: Orchestrated via LangGraph.
  • Real-time Battle UI: Live attack visualization using Next.js 14 and Tailwind.
  • RAG-Enhanced Attacks: Knowledge Base curated from HarmBench and PyRIT datasets.
  • Production API: Hardened FastAPI backend with SSE streaming (see the client sketch after this list).
  • Universal Configuration: Support for any LLM endpoint (OpenAI, Anthropic, Vertex, Local).
  • OpenClaw Integration: Test OpenClaw agents as a skill (docs).
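
The Production API above streams battle events over SSE. Below is a hedged sketch of consuming such a stream from Python with httpx; the URL path is a placeholder, not a documented route.

import httpx

# Hypothetical SSE consumer; the /api/v1/stream path is an assumption,
# not a documented route.
def stream_events(url="http://localhost:8000/api/v1/stream"):
    with httpx.stream("GET", url, timeout=None) as resp:
        for line in resp.iter_lines():
            # SSE frames arrive as "data: <payload>" lines
            if line.startswith("data:"):
                print(line[len("data:"):].strip())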

Quickstart

Prerequisites

  • Python 3.11+
  • Node.js 18+ (for frontend)
  • Google Cloud credentials (for Vertex AI access)

Installation

# 1. Clone
git clone https://github.com/sherifkozman/the-red-council.git
cd the-red-council

# 2. Set up the backend
python -m venv venv
source venv/bin/activate

# Basic installation (core functionality)
pip install -e .

# Or with framework integrations:
pip install -e ".[langchain]"      # LangChain integration
pip install -e ".[langgraph]"      # LangGraph integration
pip install -e ".[mcp]"            # MCP protocol integration
pip install -e ".[all-frameworks]" # All framework integrations

# Development dependencies (for contributing)
pip install -e ".[dev]"

# Seed the knowledge base
python -m scripts.seed_kb

# 3. Set up the frontend
cd frontend
pnpm install

Installation Options

The Red Council supports optional dependencies for framework integrations:

Extra           Install Command                       Description
Core            pip install -e .                      Core functionality, UI, and API
langchain       pip install -e ".[langchain]"         LangChain agent integration
langgraph       pip install -e ".[langgraph]"         LangGraph workflow integration
mcp             pip install -e ".[mcp]"               MCP protocol integration
all-frameworks  pip install -e ".[all-frameworks]"    All framework integrations
dev             pip install -e ".[dev]"               Development tools (pytest, ruff, mypy)

Note: Framework extras are optional. The core package works without any framework integration installed.

Running the Arena

# Terminal 1: API Backend
uvicorn src.api.main:app --port 8000

# Terminal 2: Tactical UI
cd frontend && pnpm dev

Open http://localhost:3000 to start your first campaign.

Agent Security Testing (v0.5.0)

The Red Council v0.5.0 extends beyond pure LLM testing to support AI Agent Security Testing using the OWASP Agentic Top 10 vulnerability framework.

Agent Testing Features

  • InstrumentedAgent SDK: Wrap any agent to capture tool calls, memory access, and actions
  • OWASP Agentic Top 10: Test for all 10 agent-specific vulnerabilities (ASI01-ASI10)
  • Framework Integrations: Native support for LangChain, LangGraph, and MCP protocol
  • Security Reports: Detailed vulnerability findings with remediation guidance

Quick Example

from src.agents.instrumented import InstrumentedAgent
from src.core.agent_schemas import AgentInstrumentationConfig
from src.agents.agent_judge import AgentJudge, AgentJudgeConfig

# 1. Configure instrumentation
config = AgentInstrumentationConfig(
    enable_tool_interception=True,
    enable_memory_monitoring=True,
    divergence_threshold=0.5,
)

# 2. Wrap your agent (my_agent is your existing agent instance)
instrumented = InstrumentedAgent(my_agent, "test-agent", config)

# 3. Run your agent (events are automatically captured);
#    search_func is the tool function being exercised
with instrumented:
    result = instrumented.wrap_tool_call("search", search_func, query="test")

# 4. Evaluate for security vulnerabilities
judge = AgentJudge()
score = judge.evaluate_agent(instrumented.events)

print(f"Risk Score: {score.overall_agent_risk}/10")
for violation in score.owasp_violations:
    if violation.detected:
        print(f"  {violation.owasp_category}: {violation.evidence}")

Framework Integrations

# LangChain
from src.integrations import LangChainAgentWrapper
wrapped = LangChainAgentWrapper.from_agent_executor(my_executor, config)

# LangGraph
from src.integrations import LangGraphAgentWrapper
wrapped = LangGraphAgentWrapper.from_state_graph(my_graph, config)

# MCP Protocol (async; run inside an asyncio event loop)
from src.integrations import MCPAgentWrapper
wrapped = await MCPAgentWrapper.from_stdio_server(["python", "server.py"], config)
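
Each wrapper is designed to capture the same event stream as InstrumentedAgent, so the AgentJudge evaluation from the Quick Example above should carry over unchanged.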

API Endpoints

Agent testing is available via REST API:

# Create a testing session
curl -X POST http://localhost:8000/api/v1/agent/session \
  -H "Content-Type: application/json" \
  -d '{"context": "Agent under test"}'

# Submit events
curl -X POST http://localhost:8000/api/v1/agent/session/{session_id}/events \
  -H "Content-Type: application/json" \
  -d '{"events": [{"event_type": "tool_call", "tool_name": "search", ...}]}'

# Run evaluation
curl -X POST http://localhost:8000/api/v1/agent/session/{session_id}/evaluate

# Get security report
curl http://localhost:8000/api/v1/agent/session/{session_id}/report
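
The same lifecycle works from Python. A hedged sketch with requests follows; the session_id field name in the create response is an assumption, and the event payload is abbreviated.

import requests

BASE = "http://localhost:8000/api/v1/agent"

# 1. Create a testing session (assumes the response carries a session_id field)
session_id = requests.post(
    f"{BASE}/session", json={"context": "Agent under test"}
).json()["session_id"]

# 2. Submit captured events (abbreviated; real events carry more fields)
requests.post(
    f"{BASE}/session/{session_id}/events",
    json={"events": [{"event_type": "tool_call", "tool_name": "search"}]},
)

# 3. Run the evaluation, then fetch the security report
requests.post(f"{BASE}/session/{session_id}/evaluate")
print(requests.get(f"{BASE}/session/{session_id}/report").json())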

See Agent Testing Guide for comprehensive documentation.

License

MIT - See LICENSE for details.
