Add a dedicated Chutes sidecar route without changing existing /v1/chat/completions behavior #8

@mondaylord

Summary

We want to integrate Chutes models into vllm-proxy while preserving the current production-stable behavior for existing users.

Instead of modifying the current /v1/chat/completions path, add a dedicated Chutes route:

  • POST /v1/chutes/chat/completions
  • GET /v1/chutes/models

This lets upstream gateways (e.g., Redpill) continue receiving standard OpenAI-style /v1/chat/completions requests and selectively forward Chutes-bound traffic to the new sidecar route.

Goals

  • Keep existing /v1/chat/completions behavior unchanged
  • Minimize coupling/risk to current production path
  • Add Chutes integration behind explicit route + config flag

Proposed behavior

Existing route (unchanged)

POST /v1/chat/completions

  • continues to forward to local vLLM backend as today

New Chutes route

POST /v1/chutes/chat/completions

  • reuses existing E2EE parse/decrypt logic on ingress
  • forwards plaintext request to Chutes OpenAI-compatible endpoint over TLS:
    • ${CHUTES_BASE_URL}/v1/chat/completions
    • with Authorization: Bearer ${CHUTES_API_KEY}
  • reuses existing E2EE encrypt logic on egress
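A minimal sketch of the decrypt, forward, encrypt flow described above. The names `handle_chutes_chat`, `build_upstream_request`, `decrypt`, `send`, and `encrypt` are hypothetical stand-ins, not existing vllm-proxy functions; the real E2EE helpers and HTTP client would be passed in. Streaming is not covered here.

```python
from typing import Callable, Dict, Tuple


def build_upstream_request(
    base_url: str, api_key: str, body: bytes
) -> Tuple[str, Dict[str, str], bytes]:
    """Build URL, headers, and body for the Chutes chat completions call."""
    url = f"{base_url.rstrip('/')}/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return url, headers, body


def handle_chutes_chat(
    raw_body: bytes,
    decrypt: Callable[[bytes], bytes],  # assumed: existing E2EE ingress logic
    send: Callable[[str, Dict[str, str], bytes], bytes],  # assumed: HTTPS client
    encrypt: Callable[[bytes], bytes],  # assumed: existing E2EE egress logic
    base_url: str,
    api_key: str,
) -> bytes:
    """Non-streaming path: decrypt on ingress, forward over TLS, encrypt on egress."""
    plaintext = decrypt(raw_body)
    url, headers, body = build_upstream_request(base_url, api_key, plaintext)
    upstream_response = send(url, headers, body)
    return encrypt(upstream_response)
```

Passing the crypto and transport steps in as callables keeps this path separate from the existing /v1/chat/completions handler, in line with the goal of minimal coupling.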

New Chutes models route

GET /v1/chutes/models

  • proxies ${CHUTES_BASE_URL}/v1/models with Chutes auth header
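As an illustration of the upstream call for the models route, the request could be built roughly like this. `build_models_request` is a hypothetical helper name, and the use of `urllib.request` is only for the sketch; the proxy's actual HTTP client may differ.

```python
import urllib.request


def build_models_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for the Chutes models listing."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )
```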

Config

  • CHUTES_ENABLED (default false)
  • CHUTES_BASE_URL (default https://llm.chutes.ai)
  • CHUTES_API_KEY (required if enabled)
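One possible way to load these settings, shown as a sketch. `ChutesConfig` and `load_chutes_config` are hypothetical names, and the set of strings accepted as "true" for CHUTES_ENABLED is an assumption the issue does not specify.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

DEFAULT_CHUTES_BASE_URL = "https://llm.chutes.ai"
_TRUTHY = {"1", "true", "yes", "on"}  # assumption: accepted spellings of "enabled"


@dataclass(frozen=True)
class ChutesConfig:
    enabled: bool
    base_url: str
    api_key: Optional[str]

    @property
    def misconfigured(self) -> bool:
        """True when the route is enabled but no API key is set."""
        return bool(self.enabled and not self.api_key)


def load_chutes_config(env: Mapping[str, str]) -> ChutesConfig:
    """Read the CHUTES_* settings from an environment-like mapping."""
    enabled = env.get("CHUTES_ENABLED", "false").strip().lower() in _TRUTHY
    base_url = (env.get("CHUTES_BASE_URL") or DEFAULT_CHUTES_BASE_URL).rstrip("/")
    api_key = env.get("CHUTES_API_KEY") or None
    return ChutesConfig(enabled=enabled, base_url=base_url, api_key=api_key)
```

In a running service this would typically be called once at startup as `load_chutes_config(os.environ)`.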

Security note

This is not pure client-to-model cryptographic E2EE.
It is a practical TEE-mediated segmented model:

  • Client ↔ Proxy: existing E2EE
  • Proxy ↔ Chutes: TLS
  • Plaintext is only visible inside trusted runtime boundaries

Acceptance criteria

  1. Existing route behavior remains unchanged
  2. Chutes route supports both streaming and non-streaming responses
  3. Existing E2EE nonce/replay checks remain in effect
  4. Misconfiguration (e.g., Chutes enabled without CHUTES_API_KEY) returns an explicit 503 error
  5. No API-key leakage in logs
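Criteria 4 and 5 could be approached along these lines. This is a sketch under assumptions: `misconfig_error` and `RedactSecretFilter` are hypothetical names, the error body shape is illustrative, and returning 503 when the route is disabled (rather than only when the key is missing) is a guess at the intended behavior.

```python
import logging
from typing import Optional, Tuple


def misconfig_error(enabled: bool, api_key: Optional[str]) -> Optional[Tuple[int, dict]]:
    """Return (status, body) for a misconfigured Chutes route, or None if usable."""
    if not enabled:
        # Assumption: a disabled route is also reported as 503.
        return 503, {"error": {"message": "Chutes route is disabled"}}
    if not api_key:
        return 503, {"error": {"message": "CHUTES_API_KEY is not configured"}}
    return None


class RedactSecretFilter(logging.Filter):
    """Replace a known secret with a placeholder in formatted log messages."""

    def __init__(self, secret: Optional[str]):
        super().__init__()
        self._secret = secret

    def filter(self, record: logging.LogRecord) -> bool:
        if self._secret:
            message = record.getMessage()
            if self._secret in message:
                # Store the already-formatted, redacted message and drop args.
                record.msg = message.replace(self._secret, "[REDACTED]")
                record.args = ()
        return True  # never drop the record, only rewrite it
```

A filter like this would be attached to the handlers that emit logs, for example `handler.addFilter(RedactSecretFilter(api_key))`. It only catches the key where it appears verbatim in a log message, so avoiding logging request headers in the first place remains the primary safeguard.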
