PRATHAM1084/llm-guardrail

LLM-Guardrail 🛡️

A lightweight, real-time local proxy server, token firewall, and cost optimizer for developer AI workflows. Run it locally, point your coding agents (like Cline, Aider, or custom scripts) to it, and stop worrying about runaway token costs or privacy leaks.


Why LLM-Guardrail?

If you use AI coding assistants or develop agentic loops, you probably face:

  1. Bill Shock / Runaway Costs: A runaway or infinitely recursive loop in your code can call cloud APIs (Anthropic, OpenAI) over and over, running up a large bill within minutes.
  2. Privacy Concerns: Your proprietary code is sent directly to external cloud API endpoints.
  3. Usage Limits: Message limits (e.g. Claude Pro limits) are consumed quickly.

LLM-Guardrail acts as a local interceptor that prevents these issues by:

  • Caching identical prompts locally to serve them instantly for $0.00.
  • Implementing strict hourly/daily dollar budget caps.
  • Detecting infinite loops and request floods, immediately blocking subsequent calls.
  • Providing a sleek, real-time developer dashboard to manage settings and monitor logs.

Features

🖥️ Real-time Dashboard

We package two stunning frontends that run directly in your browser:

  • Default Dark Mode (index.html): Glassmorphic cyberpunk command center dashboard.
  • Warm Light Mode (llm-guardrail-dashboard.html): Claude-inspired editorial styling.
  • Both frontends poll stats, render relative logs, toggle cache/budget options, and save configuration rules in real time. If the backend is offline, they automatically fall back to a safe Simulation Preview Mode.

⚙️ Proxy Firewall (Backend)

  • OpenAI compatible route: Intercepts POST /v1/chat/completions.
  • Anthropic compatible route: Intercepts POST /v1/messages.
  • Budget Enforcer: Blocks requests that exceed your defined hourly/daily limit.
  • Loop Prevention: Monitors prompt signatures; if identical prompts are sent repeatedly within a short window, the proxy blocks them with an HTTP 429 (Too Many Requests) error.
  • Response Cache: Stores completions locally in local_cache.json, keyed by SHA-256 signatures, so a repeated prompt is served from disk instead of triggering another paid API call.
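
The signature and loop-prevention ideas above can be sketched in a few lines of Node.js. This is a minimal illustration, not the project's actual code: promptSignature, LoopDetector, and the choice to hash only the model and messages fields are all assumptions made for the example.

```javascript
const { createHash } = require("node:crypto");

// Hypothetical helper: derive a SHA-256 signature from the request fields
// that determine the completion. Assumes callers send fields in a stable
// order, since JSON.stringify preserves insertion order.
function promptSignature(body) {
  const canonical = JSON.stringify({ model: body.model, messages: body.messages });
  return createHash("sha256").update(canonical).digest("hex");
}

// Hypothetical loop detector: remembers when each signature was last seen
// and reports a loop when the same signature returns inside the window.
class LoopDetector {
  constructor(windowMs) {
    this.windowMs = windowMs;
    this.lastSeen = new Map();
  }

  isLoop(signature, now = Date.now()) {
    const previous = this.lastSeen.get(signature);
    this.lastSeen.set(signature, now);
    return previous !== undefined && now - previous < this.windowMs;
  }
}

const sig = promptSignature({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Explain closures" }],
});
const detector = new LoopDetector(5000); // 5 s, mirroring similarity_threshold_seconds

console.log(sig.length);                 // 64 hex characters
console.log(detector.isLoop(sig, 0));    // false: first sighting
console.log(detector.isLoop(sig, 1000)); // true: repeated within 5 s
```

The same signature could double as the cache key, so a single hash per request would serve both the cache lookup and the loop check.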

Quick Start

1. Prerequisites

Ensure you have Node.js installed (v18+ recommended).

2. Installation

Clone the repository and install dependencies:

git clone https://github.com/PRATHAM1084/llm-guardrail.git
cd llm-guardrail
npm install

3. Running the Server

Option A: Running with Node.js

Start the proxy server on http://localhost:8080:

npm start

Option B: Running the Standalone Windows Executable

You can run the precompiled standalone executable directly (perfect for distributing as a single release binary):

./llm-guardrail.exe

(This executable is completely self-contained! It embeds the Node.js runtime, dependency modules, and the HTML dashboard UI templates directly inside its binary. No installation, npm install, or separate HTML files are required.)

4. Viewing the Dashboard

  • If you run Option A (Node.js): Open index.html (Dark Mode) or llm-guardrail-dashboard.html (Light Mode) in any browser, for example by double-clicking the file.
  • If you run Option B (Executable): A borderless standalone application window will automatically open showing the dashboard. You do not need to manually open any files or URLs.

Configuration (config.json)

The proxy settings can be adjusted in real time through the dashboard or by editing config.json directly. The backend watches this file and reloads it automatically whenever it changes.

{
  "cache_enabled": true,
  "hourly_budget_usd": 2.00,
  "daily_budget_usd": 10.00,
  "loop_detection": {
    "enabled": true,
    "max_requests_per_minute": 10,
    "similarity_threshold_seconds": 5
  },
  "pricing": {
    "claude-3-5-sonnet": { "input_cost_per_1k": 0.003, "output_cost_per_1k": 0.015 },
    "gpt-4o": { "input_cost_per_1k": 0.005, "output_cost_per_1k": 0.015 }
  }
}
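
To make the pricing and budget fields concrete, here is a rough sketch of how a per-request cost could be derived from the table above and compared against a cap. The function names estimateCostUsd and exceedsBudget are hypothetical, and whether the real proxy checks the budget before or after forwarding a request is not stated here, so treat this only as an illustration of the arithmetic.

```javascript
// Pricing table copied from the config.json example.
const pricing = {
  "claude-3-5-sonnet": { input_cost_per_1k: 0.003, output_cost_per_1k: 0.015 },
  "gpt-4o": { input_cost_per_1k: 0.005, output_cost_per_1k: 0.015 },
};

// Hypothetical helper: cost in USD of one request, given its token counts.
function estimateCostUsd(model, inputTokens, outputTokens) {
  const rates = pricing[model];
  if (!rates) throw new Error(`No pricing configured for model: ${model}`);
  return (
    (inputTokens / 1000) * rates.input_cost_per_1k +
    (outputTokens / 1000) * rates.output_cost_per_1k
  );
}

// Hypothetical check: would this request push spend past the cap?
function exceedsBudget(spentUsd, requestCostUsd, budgetUsd) {
  return spentUsd + requestCostUsd > budgetUsd;
}

// 2,000 input tokens and 1,000 output tokens on gpt-4o:
// 2 * 0.005 + 1 * 0.015 = 0.025 USD
const cost = estimateCostUsd("gpt-4o", 2000, 1000);
console.log(cost.toFixed(3));                 // 0.025
console.log(exceedsBudget(1.99, cost, 2.0));  // true: would cross the hourly cap
console.log(exceedsBudget(1.0, cost, 2.0));   // false
```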

How to Integrate with Coding Tools

1. Cline (VS Code Extension)

In the extension settings:

  • API Provider: Select OpenAI Compatible or Anthropic.
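  • Base URL: Enter http://localhost:8080/v1 for OpenAI Compatible. For Anthropic, if the client appends /v1/messages to the base URL itself (as the Anthropic SDK does), enter http://localhost:8080 instead, as in the Aider example.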
  • API Key: Enter your real API key (the proxy forwards this securely).

2. Aider CLI

Launch Aider with custom API base parameters:

# For OpenAI models:
aider --openai-api-base http://localhost:8080/v1

# For Claude models:
aider --anthropic-api-url http://localhost:8080
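
Custom scripts can be pointed at the proxy in the same way. The sketch below targets the OpenAI-compatible route using the fetch built into Node.js 18+. The helper names buildChatRequest and chatViaProxy are made up for this example, and it assumes the proxy passes the Authorization header through to the upstream provider.

```javascript
// Assumption: the proxy is running on the default address from the Quick Start.
const PROXY_BASE_URL = "http://localhost:8080/v1";

// Builds the URL and fetch options for an OpenAI-style chat completion.
function buildChatRequest(apiKey, model, prompt) {
  return {
    url: `${PROXY_BASE_URL}/chat/completions`,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

// Sends the request through the proxy (requires the server to be running).
async function chatViaProxy(apiKey, model, prompt) {
  const { url, options } = buildChatRequest(apiKey, model, prompt);
  const res = await fetch(url, options);
  if (res.status === 429) {
    // The proxy answers 429 when loop prevention blocks a request.
    throw new Error("Request blocked by LLM-Guardrail (HTTP 429)");
  }
  if (!res.ok) throw new Error(`Upstream error: HTTP ${res.status}`);
  return res.json();
}
```

A script would then call something like chatViaProxy(process.env.OPENAI_API_KEY, "gpt-4o", "Hello") and handle the 429 case by backing off rather than retrying immediately.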

Admin API Reference

The dashboard communicates with the backend via these lightweight admin routes:

  • GET /api/status - Returns a heartbeat and the server uptime.
  • GET /api/stats - Returns spend metrics, total money saved, and the active configuration.
  • GET /api/logs - Returns the last 50 requests (tokens, costs, hashes, status).
  • POST /api/config - Updates proxy settings programmatically.
  • POST /api/clear-cache - Purges the local_cache.json cache file.
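
These routes can also be driven from a script. The following is a small sketch of a client wrapper. The route table mirrors the list above, but the response format (assumed JSON) and the payload accepted by POST /api/config (assumed to use the same keys as config.json) are guesses, and buildAdminRequest and callAdmin are hypothetical names.

```javascript
// Assumption: the proxy is running on the default address from the Quick Start.
const ADMIN_BASE_URL = "http://localhost:8080";

// Maps short action names to the admin routes listed above.
const ADMIN_ROUTES = {
  status: { method: "GET", path: "/api/status" },
  stats: { method: "GET", path: "/api/stats" },
  logs: { method: "GET", path: "/api/logs" },
  config: { method: "POST", path: "/api/config" },
  clearCache: { method: "POST", path: "/api/clear-cache" },
};

// Builds the URL and fetch options for one admin action.
function buildAdminRequest(action, payload) {
  const route = ADMIN_ROUTES[action];
  if (!route) throw new Error(`Unknown admin action: ${action}`);
  const options = { method: route.method };
  if (route.method === "POST" && payload !== undefined) {
    options.headers = { "Content-Type": "application/json" };
    options.body = JSON.stringify(payload);
  }
  return { url: ADMIN_BASE_URL + route.path, options };
}

// Performs the call (requires the server to be running).
// Assumes the backend replies with JSON.
async function callAdmin(action, payload) {
  const { url, options } = buildAdminRequest(action, payload);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`${action} failed with HTTP ${res.status}`);
  return res.json();
}
```

For example, callAdmin("config", { cache_enabled: false }) would attempt to switch the cache off, and callAdmin("logs") would fetch the recent request log.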