The Red Council

AI Red Team & Security — Attack. Assess. Patch.

What is The Red Council?

The Red Council is an automated adversarial testing platform for Large Language Models. It implements a closed-loop security workflow that identifies vulnerabilities, generates defenses automatically, and verifies their effectiveness in real time.

It leverages Gemini 3 Pro for attack generation, judging, and defense.

Core Loop

  1. Attack: Red Team agent generates adversarial prompts using a Knowledge Base of 165+ curated artifacts.
  2. Judge: Impartial evaluator scores the target's response for security breaches (secret leakage, policy violations).
  3. Defend: If a breach is detected, the Blue Team agent automatically hardens the target's system prompt.
  4. Verify: The orchestrator re-runs the attack against the hardened model to prove the fix works.
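For orientation, the sketch below traces that loop in Python. Everything here is illustrative: red_team, judge, blue_team, and target are hypothetical stand-ins, not the project's actual LangGraph API.

# Illustrative sketch only; the real orchestration is a LangGraph graph.
def run_campaign(target, red_team, judge, blue_team, rounds=5):
    for _ in range(rounds):
        attack = red_team.generate(target)        # 1. Attack (RAG over the KB)
        response = target.respond(attack)
        verdict = judge.score(attack, response)   # 2. Judge
        if verdict.breached:
            # 3. Defend: harden the target's system prompt
            target.system_prompt = blue_team.harden(
                target.system_prompt, attack, response
            )
            # 4. Verify: re-run the same attack against the hardened target
            assert not judge.score(attack, target.respond(attack)).breached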

Key Features

  • Multi-Agent Adversarial Flow: Orchestrated via LangGraph.
  • Real-time Battle UI: Live attack visualization using Next.js 14 and Tailwind.
  • RAG-Enhanced Attacks: Knowledge Base curated from HarmBench and PyRIT datasets.
  • Production API: Hardened FastAPI backend with SSE streaming (see the client sketch after this list).
  • Universal Configuration: Support for any LLM endpoint (OpenAI, Anthropic, Vertex, Local).
  • OpenClaw Integration: Test OpenClaw agents as a skill (docs).
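
The Production API above streams battle events over SSE. Below is a hedged sketch of consuming such a stream from Python with httpx; the URL path is a placeholder, not a documented route.

import httpx

# Hypothetical SSE consumer; the /api/v1/stream path is an assumption,
# not a documented route.
def stream_events(url="http://localhost:8000/api/v1/stream"):
    with httpx.stream("GET", url, timeout=None) as resp:
        for line in resp.iter_lines():
            # SSE frames arrive as "data: <payload>" lines
            if line.startswith("data:"):
                print(line[len("data:"):].strip())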

Quickstart

Prerequisites

  • Python 3.11+
  • Node.js 18+ (for frontend)
  • Google Cloud credentials (for Vertex AI access)

Installation

# 1. Clone
git clone https://github.com/sherifkozman/the-red-council.git
cd the-red-council

# 2. Set up the backend
python -m venv venv
source venv/bin/activate

# Basic installation (core functionality)
pip install -e .

# Or with framework integrations:
pip install -e ".[langchain]"      # LangChain integration
pip install -e ".[langgraph]"      # LangGraph integration
pip install -e ".[mcp]"            # MCP protocol integration
pip install -e ".[all-frameworks]" # All framework integrations

# Development dependencies (for contributing)
pip install -e ".[dev]"

# Seed the knowledge base
python -m scripts.seed_kb

# 3. Set up the frontend
cd frontend
pnpm install

Installation Options

The Red Council supports optional dependencies for framework integrations:

Extra           Install Command                       Description
Core            pip install -e .                      Core functionality, UI, and API
langchain       pip install -e ".[langchain]"         LangChain agent integration
langgraph       pip install -e ".[langgraph]"         LangGraph workflow integration
mcp             pip install -e ".[mcp]"               MCP protocol integration
all-frameworks  pip install -e ".[all-frameworks]"    All framework integrations
dev             pip install -e ".[dev]"               Development tools (pytest, ruff, mypy)

Note: Framework extras are optional. The core package works without any framework integration installed.

Running the Arena

# Terminal 1: API Backend
uvicorn src.api.main:app --port 8000

# Terminal 2: Tactical UI
cd frontend && pnpm dev

Open http://localhost:3000 to start your first campaign.

Agent Security Testing (v0.5.0)

The Red Council v0.5.0 extends beyond pure LLM testing to support AI Agent Security Testing using the OWASP Agentic Top 10 vulnerability framework.

Agent Testing Features

  • InstrumentedAgent SDK: Wrap any agent to capture tool calls, memory access, and actions
  • OWASP Agentic Top 10: Test for all 10 agent-specific vulnerabilities (ASI01-ASI10)
  • Framework Integrations: Native support for LangChain, LangGraph, and MCP protocol
  • Security Reports: Detailed vulnerability findings with remediation guidance

Quick Example

from src.agents.instrumented import InstrumentedAgent
from src.core.agent_schemas import AgentInstrumentationConfig
from src.agents.agent_judge import AgentJudge, AgentJudgeConfig

# 1. Configure instrumentation
config = AgentInstrumentationConfig(
    enable_tool_interception=True,
    enable_memory_monitoring=True,
    divergence_threshold=0.5,
)

# 2. Wrap your agent (my_agent is your existing agent instance)
instrumented = InstrumentedAgent(my_agent, "test-agent", config)

# 3. Run your agent (events are automatically captured);
#    search_func is the tool function being exercised
with instrumented:
    result = instrumented.wrap_tool_call("search", search_func, query="test")

# 4. Evaluate for security vulnerabilities
judge = AgentJudge()
score = judge.evaluate_agent(instrumented.events)

print(f"Risk Score: {score.overall_agent_risk}/10")
for violation in score.owasp_violations:
    if violation.detected:
        print(f"  {violation.owasp_category}: {violation.evidence}")

Framework Integrations

# LangChain
from src.integrations import LangChainAgentWrapper
wrapped = LangChainAgentWrapper.from_agent_executor(my_executor, config)

# LangGraph
from src.integrations import LangGraphAgentWrapper
wrapped = LangGraphAgentWrapper.from_state_graph(my_graph, config)

# MCP Protocol (async; run inside an asyncio event loop)
from src.integrations import MCPAgentWrapper
wrapped = await MCPAgentWrapper.from_stdio_server(["python", "server.py"], config)
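
Each wrapper is designed to capture the same event stream as InstrumentedAgent, so the AgentJudge evaluation from the Quick Example above should carry over unchanged.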

API Endpoints

Agent testing is available via REST API:

# Create a testing session
curl -X POST http://localhost:8000/api/v1/agent/session \
  -H "Content-Type: application/json" \
  -d '{"context": "Agent under test"}'

# Submit events
curl -X POST http://localhost:8000/api/v1/agent/session/{session_id}/events \
  -H "Content-Type: application/json" \
  -d '{"events": [{"event_type": "tool_call", "tool_name": "search", ...}]}'

# Run evaluation
curl -X POST http://localhost:8000/api/v1/agent/session/{session_id}/evaluate

# Get security report
curl http://localhost:8000/api/v1/agent/session/{session_id}/report
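
The same lifecycle works from Python. A hedged sketch with requests follows; the session_id field name in the create response is an assumption, and the event payload is abbreviated.

import requests

BASE = "http://localhost:8000/api/v1/agent"

# 1. Create a testing session (assumes the response carries a session_id field)
session_id = requests.post(
    f"{BASE}/session", json={"context": "Agent under test"}
).json()["session_id"]

# 2. Submit captured events (abbreviated; real events carry more fields)
requests.post(
    f"{BASE}/session/{session_id}/events",
    json={"events": [{"event_type": "tool_call", "tool_name": "search"}]},
)

# 3. Run the evaluation, then fetch the security report
requests.post(f"{BASE}/session/{session_id}/evaluate")
print(requests.get(f"{BASE}/session/{session_id}/report").json())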

See Agent Testing Guide for comprehensive documentation.

License

MIT - See LICENSE for details.
