A lightweight, real-time local proxy server, token firewall, and cost optimizer for developer AI workflows. Run it locally, point your coding agents (like Cline, Aider, or custom scripts) to it, and stop worrying about runaway token costs or privacy leaks.
If you use AI coding assistants or develop agentic loops, you probably face:
- Bill Shock / Runaway Costs: An infinite recursive loop in your code can call cloud APIs (Claude, OpenAI) repeatedly, charging hundreds of dollars in minutes.
- Privacy Concerns: Your proprietary code is sent directly to external cloud API endpoints.
- Usage Limits: Message limits (e.g. Claude Pro limits) are consumed quickly.
LLM-Guardrail acts as a local interceptor that prevents these issues by:
- Caching identical prompts locally to serve them instantly for $0.00.
- Implementing strict hourly/daily dollar budget caps.
- Detecting infinite loops and request floods, immediately blocking subsequent calls.
- Providing a sleek, real-time developer dashboard to manage settings and monitor logs.
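The budget cap idea can be sketched as a small spend tracker over a trailing time window. This is a minimal illustration, not the project's actual code: `estimateCostUsd`, `BudgetTracker`, and the `PRICING` constant are hypothetical names, and the per-1k-token prices are copied from the sample `pricing` section of config.json.

```javascript
// Hypothetical sketch of a trailing-window budget cap (not the project's code).
// Prices per 1,000 tokens, shaped like the `pricing` section of config.json.
const PRICING = {
  "claude-3-5-sonnet": { input_cost_per_1k: 0.003, output_cost_per_1k: 0.015 },
  "gpt-4o": { input_cost_per_1k: 0.005, output_cost_per_1k: 0.015 },
};

// Estimate the dollar cost of one call from its token counts.
function estimateCostUsd(model, inputTokens, outputTokens) {
  const price = PRICING[model];
  if (!price) {
    // Assumption: refuse to guess a price for models missing from the table.
    throw new Error(`No pricing configured for model: ${model}`);
  }
  return (
    (inputTokens / 1000) * price.input_cost_per_1k +
    (outputTokens / 1000) * price.output_cost_per_1k
  );
}

class BudgetTracker {
  constructor(limitUsd, windowMs) {
    this.limitUsd = limitUsd;
    this.windowMs = windowMs;
    this.entries = []; // each entry: { at: timestampMs, costUsd }
  }

  // Total spend inside the window that ends at `now`; older entries are dropped.
  spent(now) {
    this.entries = this.entries.filter((e) => now - e.at < this.windowMs);
    return this.entries.reduce((sum, e) => sum + e.costUsd, 0);
  }

  // True while the spend in the window is still below the cap.
  allows(now) {
    return this.spent(now) < this.limitUsd;
  }

  // Record the cost of a completed call.
  record(now, costUsd) {
    this.entries.push({ at: now, costUsd });
  }
}
```

In this sketch an hourly cap of $2.00 would be `new BudgetTracker(2.0, 60 * 60 * 1000)`. A proxy would call `allows(Date.now())` before forwarding a request and `record(...)` once the token counts of the response are known.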
We package two stunning frontends that run directly in your browser:
- Default Dark Mode (index.html): Glassmorphic cyberpunk command center dashboard.
- Warm Light Mode (llm-guardrail-dashboard.html): Claude-inspired editorial styling.
- Both frontends poll stats, render relative logs, toggle cache/budget options, and save configuration rules in real time. If the backend is offline, they automatically run in a safe Simulation Preview Mode.
- OpenAI compatible route: Intercepts POST /v1/chat/completions.
- Anthropic compatible route: Intercepts POST /v1/messages.
- Budget Enforcer: Blocks requests that exceed your defined hourly/daily limit.
- Loop Prevention: Monitors prompt signatures; if identical prompts are sent repeatedly in a short duration, the proxy blocks them with a 429 error.
- Response Cache: Stores completions locally in local_cache.json using SHA-256 signatures, cutting API costs down significantly.
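Prompt signatures and loop prevention could work along the following lines. This is a minimal sketch under stated assumptions: which request fields are hashed, the names `promptSignature` and `LoopDetector`, and the window and repeat thresholds are all hypothetical. Only the use of SHA-256 and the 429 status come from the feature list above.

```javascript
const crypto = require("node:crypto");

// SHA-256 signature of a request. Assumption: the signature covers the model
// name and the message list; the real proxy may hash different fields.
// Note that JSON.stringify is sensitive to object key order.
function promptSignature(model, messages) {
  return crypto
    .createHash("sha256")
    .update(JSON.stringify({ model, messages }))
    .digest("hex");
}

// Hypothetical loop detector: counts how often one signature appears
// within a sliding time window.
class LoopDetector {
  constructor(windowMs, maxRepeats) {
    this.windowMs = windowMs;
    this.maxRepeats = maxRepeats;
    this.seen = new Map(); // signature -> array of timestamps (ms)
  }

  // Returns the status this sketch would answer with:
  // 200 to let the request through, 429 to block it.
  check(signature, now) {
    const recent = (this.seen.get(signature) || []).filter(
      (t) => now - t < this.windowMs
    );
    recent.push(now);
    this.seen.set(signature, recent);
    return recent.length > this.maxRepeats ? 429 : 200;
  }
}
```

The same signature can double as a cache key: a lookup such as `cache.get(promptSignature(model, messages))` against an in-memory `Map` (or the contents of local_cache.json) would decide whether a stored completion can be returned without calling the upstream API.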
Ensure you have Node.js installed (v18+ recommended).
Clone the repository and install dependencies:
git clone https://github.com/your-username/llm-guardrail.git
cd llm-guardrail
npm install
Start the proxy server on http://localhost:8080:
npm start
You can run the precompiled standalone executable directly (perfect for distributing as a single release binary):
./llm-guardrail.exe
(This executable is completely self-contained. It embeds the Node.js runtime, dependency modules, and the HTML dashboard UI templates directly inside its binary. No installation, npm install, or separate HTML files are required.)
- If you run Option A (Node.js): Double-click and open index.html (Dark Mode) or llm-guardrail-dashboard.html (Light Mode) in any browser.
- If you run Option B (Executable): A borderless standalone application window will automatically open showing the dashboard. You do not need to manually open any files or URLs.
The proxy settings can be adjusted in real-time through the dashboard or by modifying config.json directly. The backend automatically watches and reloads this file upon changes.
{
"cache_enabled": true,
"hourly_budget_usd": 2.00,
"daily_budget_usd": 10.00,
"loop_detection": {
"enabled": true,
"max_requests_per_minute": 10,
"similarity_threshold_seconds": 5
},
"pricing": {
"claude-3-5-sonnet": { "input_cost_per_1k": 0.003, "output_cost_per_1k": 0.015 },
"gpt-4o": { "input_cost_per_1k": 0.005, "output_cost_per_1k": 0.015 }
}
}
In the extension settings:
- API Provider: Select OpenAI Compatible or Anthropic.
- Base URL: Enter http://localhost:8080/v1.
- API Key: Enter your real API key (the proxy forwards this securely).
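Custom scripts can be pointed at the proxy in the same way, by swapping the provider's base URL for the local one. The sketch below targets the OpenAI compatible route. The helper names `buildChatRequest` and `askThroughProxy` are hypothetical, and it assumes the proxy is running on http://localhost:8080 and passes the `Authorization` header through to the upstream API.

```javascript
// Assumption: the proxy listens on its default address from the setup steps.
const PROXY_BASE_URL = "http://localhost:8080/v1";

// Build (but do not send) an OpenAI-style chat completion request
// addressed to the proxy instead of the cloud endpoint.
function buildChatRequest(apiKey, model, prompt) {
  return {
    url: `${PROXY_BASE_URL}/chat/completions`,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

// Send the request with the fetch API built into Node.js 18+.
async function askThroughProxy(apiKey, model, prompt) {
  const { url, options } = buildChatRequest(apiKey, model, prompt);
  const res = await fetch(url, options);
  if (res.status === 429) {
    // 429 is the status the feature list names for loop prevention.
    throw new Error("Request blocked by the proxy (429)");
  }
  return res.json();
}
```

A script would then call, for example, `askThroughProxy(process.env.OPENAI_API_KEY, "gpt-4o", "Hello")`, where the environment variable name is only an illustration.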
Launch Aider with custom API base parameters:
# For OpenAI models:
aider --openai-api-base http://localhost:8080/v1
# For Claude models:
aider --anthropic-api-url http://localhost:8080
The dashboard communicates with the backend via these lightweight admin routes:
- GET /api/status - Heartbeat and server uptime checks.
- GET /api/stats - Fetch spend metrics, total money saved, and active configuration.
- GET /api/logs - Returns the last 50 requests (tokens, costs, hashes, status).
- POST /api/config - Update proxy settings programmatically.
- POST /api/clear-cache - Purges the local_cache.json storage database.
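These routes can also be called from a script. The following is a rough sketch only: the route paths and methods come from the list above, but the helper names, the JSON request body for POST /api/config, and the assumption that every route answers with JSON are not confirmed by this document.

```javascript
// Assumption: the admin routes are served from the proxy's default address.
const ADMIN_BASE_URL = "http://localhost:8080";

// Route table taken from the list of admin routes.
const ADMIN_ROUTES = {
  status: { method: "GET", path: "/api/status" },
  stats: { method: "GET", path: "/api/stats" },
  logs: { method: "GET", path: "/api/logs" },
  config: { method: "POST", path: "/api/config" },
  clearCache: { method: "POST", path: "/api/clear-cache" },
};

// Build (but do not send) a request for one of the admin routes.
function buildAdminRequest(name, body) {
  const route = ADMIN_ROUTES[name];
  if (!route) throw new Error(`Unknown admin route: ${name}`);
  const options = { method: route.method };
  if (route.method === "POST" && body !== undefined) {
    // Assumption: POST routes accept a JSON body.
    options.headers = { "Content-Type": "application/json" };
    options.body = JSON.stringify(body);
  }
  return { url: `${ADMIN_BASE_URL}${route.path}`, options };
}

// Send the request with the fetch API built into Node.js 18+.
async function callAdmin(name, body) {
  const { url, options } = buildAdminRequest(name, body);
  const res = await fetch(url, options);
  return res.json(); // assumed: responses are JSON; their shape is undocumented here
}
```

For instance, `callAdmin("config", { hourly_budget_usd: 1.0 })` would lower the hourly cap if the endpoint accepts the same keys as config.json, which is an assumption rather than documented behavior.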