
feat(ccproxy): v2.0.0 — inspector architecture, lightllm, DAG pipeline, compliance#16

Open
starbaser wants to merge 369 commits into main from dev


Conversation

starbaser (Owner) commented Apr 16, 2026

AI Summary

Complete rewrite of ccproxy from a LiteLLM proxy subprocess model to an in-process mitmproxy-based transparent LLM API interceptor. This is the v2.0.0 release (tagged v2.0.0-rc1).

  • Inspector architecture: mitmweb runs in-process via WebMaster API with dual listeners — reverse proxy + WireGuard namespace jail. No subprocess, no gateway server.
  • lightllm: Surgical nerve connector into LiteLLM's BaseConfig transformation pipeline, bypassing cost tracking and callback machinery entirely.
  • DAG-based hook pipeline: @hook(reads=..., writes=...) decorator-declared data dependencies, topologically sorted via Kahn's algorithm. Per-request overrides via x-ccproxy-hooks header.
  • SSE streaming: SseTransformer stateful stream callable — parses, transforms per-chunk via LiteLLM's provider iterators, re-serializes as OpenAI-format SSE.
  • Compliance profile learning: Provider-agnostic system that observes legitimate request shapes from WireGuard traffic and stamps compliance profiles onto proxied requests.
  • Gemini/Vertex AI support: Full routing, OAuth handling, context caching via cachedContents API, path rewriting for cloudcode-pa.googleapis.com.
  • Flows CLI: ccproxy flows list/dump/diff/compare/clear with multi-page HAR 1.2 output, jq filtering, and sliding-window diff across flow sets.
  • MCP notification endpoint: POST /mcp/notify for terminal event ingestion, buffered and injected as synthetic tool_use/tool_result pairs.
  • XDG config directory: Default config moved to ~/.config/ccproxy/ (breaking change).
  • init replaces install: CLI rename (breaking change).
  • Rich pipeline visualization: render_pipeline() builds a full DAG display with parallel groups via rich.columns.Columns.
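
The DAG-based hook pipeline described in the list above can be illustrated with a minimal sketch. It assumes a hook is a plain function carrying `reads` and `writes` sets; the `hook` decorator and `topo_sort` helper below are illustrative stand-ins rather than ccproxy's actual implementation, and the three hook names are hypothetical.

```python
from collections import deque


def hook(*, reads=(), writes=()):
    """Attach declared data dependencies to a hook function (illustrative)."""
    def decorate(fn):
        fn.reads = frozenset(reads)
        fn.writes = frozenset(writes)
        return fn
    return decorate


def topo_sort(hooks):
    """Order hooks so each writer of a key runs before its readers (Kahn's algorithm)."""
    writers = {}
    for h in hooks:
        for key in h.writes:
            writers.setdefault(key, []).append(h)

    successors = {h: set() for h in hooks}
    indegree = {h: 0 for h in hooks}
    for h in hooks:
        for key in h.reads:
            for w in writers.get(key, []):
                if w is not h and h not in successors[w]:
                    successors[w].add(h)
                    indegree[h] += 1

    ready = deque(h for h in hooks if indegree[h] == 0)
    ordered = []
    while ready:
        current = ready.popleft()
        ordered.append(current)
        for nxt in hooks:  # declaration order keeps the result deterministic
            if nxt in successors[current]:
                indegree[nxt] -= 1
                if indegree[nxt] == 0:
                    ready.append(nxt)

    if len(ordered) != len(hooks):
        raise ValueError("cycle detected in hook dependencies")
    return ordered


# Hypothetical hooks, declared out of order on purpose.
@hook(reads=["route"], writes=["headers"])
def add_headers(ctx):
    return ctx


@hook(reads=["model"], writes=["route"])
def pick_route(ctx):
    return ctx


@hook(writes=["model"])
def extract_model(ctx):
    return ctx


order = [h.__name__ for h in topo_sort([add_headers, pick_route, extract_model])]
```

Hooks whose in-degree reaches zero at the same step have no data dependency on each other, which is the property a renderer could use to show them as a parallel group.
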

Breaking Changes

  • Config directory: ~/.ccproxy/ → ~/.config/ccproxy/
  • CLI: ccproxy install → ccproxy init
  • --debug flag replaced by --log-level / -v
  • forward_port / reverse_port replaced by unified port config
  • mitm config section renamed to inspect
  • Prisma/database infrastructure removed entirely
  • LiteLLM proxy subprocess removed
  • to_mermaid / to_ascii removed from HookDAG

Test plan

  • just test passes with ≥90% coverage
  • just lint / just typecheck clean
  • Smoke test: ccproxy run --inspect -- claude --model haiku -p "what's 2+2"
  • Verify ccproxy init creates config at ~/.config/ccproxy/
  • Verify flows CLI: ccproxy flows list, ccproxy flows dump
  • Verify Gemini routing through inspector

Replace hardcoded `system = "x86_64-linux"` with a perSystem pattern
using `lib.genAttrs` over `supportedSystems`, mapping packages, devShells,
and lib outputs across both platforms.

prisma-client-py's `prisma generate` writes into site-packages/prisma/
which is read-only in the Nix store. Move generation to build time via
a new derivation that pre-fetches the Prisma CLI npm packages
(importNpmLock), copies the base prisma package with writable
permissions, and runs `prisma generate` with stub engine binaries. The
wrapper prepends PYTHONPATH so the generated package shadows the base
wheel at runtime.

Allows independent configuration of forward and reverse proxy ports,
enabling LiteLLM to keep its main port while reverse proxy listens on a
separate port. Adds confdir parameter to start_mitm for explicit CA
certificate store initialization.

BREAKING CHANGE: mitm.port renamed to mitm.forward_port; add
  mitm.reverse_port to config if using reverse proxy on
  different port
…cess-compose

- Replace devenv.nix workflow with justfile task recipes
- Replace devenv up with process-compose.yml for process management
- Update development documentation with new command structure
- Configure dedicated dev instance ports (4001-4003) to avoid production conflicts
- Replace compose.yaml with docker-compose.yaml for container orchestration
- Update MITM proxy documentation with configurable port behavior
- Add Dev Instance section to CLAUDE.md with port mapping table
…ess with OTel

Consolidate the two separate mitmdump processes (reverse + forward) into a
single mitmproxy multi-mode process using --mode reverse:...@PORT --mode
regular@port. Per-flow direction detection via flow.client_conn.proxy_mode
replaces the startup-time env var approach.

- Replace --mitm flag with --inspect (always uses mitmweb for browser UI)
- Add OTel span emission module (telemetry.py) with graceful degradation
- Add Jaeger all-in-one container for trace collection/visualization
- Add otel optional dependency group to pyproject.toml
- Add inspect_port, otel_enabled, otel_endpoint config to MitmConfig
…Docker

Prisma query engine in the mitmdump subprocess couldn't reach PostgreSQL:
localhost resolved to ::1 (IPv6) but Docker only binds 127.0.0.1 (IPv4).

- Use 127.0.0.1 instead of localhost in database_url defaults
- Resolve database_url from ccproxy.yaml when env vars aren't set
- Propagate CCPROXY_DATABASE_URL to subprocess via _build_env()
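
The `localhost` versus `127.0.0.1` fix above can be sketched as a small URL rewrite. This is an illustration only: the helper name `force_ipv4_loopback` and the example URL are hypothetical, and the commit itself only changes defaults rather than rewriting URLs at runtime.

```python
from urllib.parse import urlsplit, urlunsplit


def force_ipv4_loopback(database_url: str) -> str:
    """Replace a `localhost` host with 127.0.0.1, keeping credentials and port."""
    parts = urlsplit(database_url)
    if parts.hostname != "localhost":
        return database_url
    # netloc looks like "user:pw@localhost:5432"; split off userinfo and port.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    _host, colon, port = hostport.partition(":")
    netloc = f"{userinfo}{at}127.0.0.1{colon}{port}"
    return urlunsplit(parts._replace(netloc=netloc))


# Hypothetical URL for illustration.
fixed = force_ipv4_loopback("postgresql://user:pw@localhost:5432/ccproxy")
```

Pinning the address avoids depending on whether the resolver returns `::1` or `127.0.0.1` first for `localhost`.
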
…e confinement

Replace PID-based process management with foreground-only operation and
add network namespace confinement for `ccproxy run --inspect`.

- Remove `--detach`, `stop`, `restart` commands; delete `process.py`
- `start_litellm` runs foreground with Popen child management for mitmweb
- `show_status` uses TCP health probes instead of PID files
- `start_mitm`/`start_shadow_mitm` return Popen, no PID files
- Nix module: Type=simple, no ExecStop, rename --mitm to --inspect
- `--inspect` unconditionally activates reverse + regular + wireguard modes
- Add `namespace.py`: create_namespace, run_in_namespace, cleanup_namespace
- slirp4netns bridges namespace to host via ready-fd/exit-fd lifecycle
- `ccproxy run --inspect` hard-fails when prerequisites are missing
- Add UDP port checking in preflight for WireGuard port
- Add ProxyDirection.WIREGUARD for per-flow direction detection
- Skip broken oauth refresh test (pre-existing failure)
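
The switch from PID files to TCP health probes in `show_status` can be sketched as follows. The function name `tcp_probe` and the timeout value are assumptions made for illustration; the source does not show the real probe.

```python
import socket


def tcp_probe(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A probe like this reports what is actually listening, so it cannot go stale the way a leftover PID file can.
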
Shadow mode (HTTP_PROXY injection via standalone mitmdump) is replaced by
WireGuard-based namespace confinement. Remove --shadow flag, ProxyMode enum,
start_shadow_mitm(), shadow loopback filter, and all shadow references.
ccproxy run --inspect is now the sole subprocess capture mechanism.

Add slirp4netns, wireguard-tools, and iproute2 to the Nix devShell
packages so `ccproxy run --inspect` has all prerequisites available.

- Add --preserve-credentials to nsenter (fixes setgroups EPERM in user ns)
- Strip wg-quick-only fields (Address, DNS) from WG conf for `wg setconf`
- Fix _ensure_combined_ca_bundle to search confdir before ~/.mitmproxy
- Set CURL_CA_BUNDLE in namespace env for curl CA trust
- Fix mitmweb web_password auth for WG client conf retrieval
- Fix /proc/net/udp6 parsing (rsplit for IPv6 addresses)
- Fix mitmweb /state response parsing (servers is dict, not list)
- Suppress slirp4netns/sentinel stderr from user output
- Remove stale config_dir param from run_preflight_checks
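
The "strip wg-quick-only fields" item in the list above can be sketched as a line filter. Only `Address` and `DNS` are handled because those are the two fields the commit names; the helper name and the sample config values are hypothetical.

```python
WG_QUICK_ONLY_KEYS = {"address", "dns"}  # the two fields named in the commit


def strip_wg_quick_fields(conf_text: str) -> str:
    """Drop wg-quick-only keys so the result can be fed to `wg setconf`."""
    kept = []
    for line in conf_text.splitlines():
        key = line.split("=", 1)[0].strip().lower()
        if "=" in line and key in WG_QUICK_ONLY_KEYS:
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"


# Hypothetical client config for illustration (keys are not real).
sample = (
    "[Interface]\n"
    "PrivateKey = abc=\n"
    "Address = 10.0.0.1/32\n"
    "DNS = 10.0.0.53\n"
    "\n"
    "[Peer]\n"
    "PublicKey = def=\n"
    "Endpoint = 127.0.0.1:51820\n"
    "AllowedIPs = 0.0.0.0/0\n"
)
cleaned = strip_wg_quick_fields(sample)
```
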
Remove references to --detach, stop, restart commands. Document
--inspect flag, WireGuard namespace confinement, namespace.py module,
ProxyDirection.WIREGUARD, dev ports for WireGuard and inspect UI.

In inspect mode, the MITM addon now detects LLM API domain traffic
arriving through the WireGuard tunnel and rewrites requests to target
LiteLLM instead of the original API endpoint. This eliminates the need
for base URL env vars inside the namespace (which pointed at unreachable
127.0.0.1 loopback), letting Claude CLI use its default API URLs while
traffic flows transparently through WireGuard → mitmproxy → LiteLLM.

Inspect mode replaces the old MITM mode; direction is now an internal
implementation detail. Remove --direction flag from db-prompt, collapse
status display to single inspect entry, simplify ProxyDirection to
internal mode identifiers.

Add stdenv.cc.cc.lib to tokenizers wheel buildInputs so autoPatchelf
resolves libstdc++.so.6 at build time, fixing import failures in
production Nix packages (nix profile install / Home Manager module).

Replace broken file-based ccproxy logs with journal/process-compose
delegation: mitmweb and slirp4netns output now pipes to parent stderr
with [tag] prefixes (captured by systemd journal), and the logs command
dispatches to journalctl or process-compose automatically.

Delete pipeline/validation.py (unreferenced module), unused
create_hook_spec/store_pending/get_pending from hook registry,
metrics_enabled config field with no readers, and unused
provider_name parameter on _handle_sentinel_key.

Strip ~40 redundant comments restating code, collapse identical
tokenizer branches, remove over-defensive try/except and
unreachable dict branch, clean out leftover [CACHE DEBUG] prints
and orphaned rich.print import.

Remove vestigial stop/restart from CLI subcommands set, fix stale
"ccproxy restart" suggestion, and update CLAUDE.md with missing
modules and db-prompt docs.

Unify naming: the YAML key `ccproxy.mitm` becomes `ccproxy.inspect`,
MitmConfig class becomes InspectConfig. The `src/ccproxy/mitm/` package
directory is unchanged (internal module name).
…onfig

Remove enabled, forward_port, reverse_port, upstream_proxy from YAML
template and Nix defaults. These are internal implementation details
auto-derived from LiteLLM port config — only exposed in devShell
overrides for port deconfliction.

SSL_CERT_FILE now validates the path exists before trusting it, with
fallback chain: certifi → system CA bundle. MITM combined CA bundle
creation moved after mitmproxy starts (was running before, so the CA
cert didn't exist on fresh setups). All four cert env vars now set
for the LiteLLM subprocess (SSL_CERT_FILE, REQUESTS_CA_BUNDLE,
CURL_CA_BUNDLE, NODE_EXTRA_CA_CERTS) matching run --inspect behavior.
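
The fallback chain described above (validated `SSL_CERT_FILE`, then certifi, then a system CA bundle) can be sketched like this. The function names and the list of system bundle paths are assumptions; the source does not say which system paths are probed, and pointing all four variables at one file is a simplification.

```python
import os
from pathlib import Path

# Assumed common locations; the real candidate list is not given in the source.
SYSTEM_CA_CANDIDATES = (
    "/etc/ssl/certs/ca-certificates.crt",
    "/etc/pki/tls/certs/ca-bundle.crt",
    "/etc/ssl/cert.pem",
)

CERT_ENV_VARS = (
    "SSL_CERT_FILE",
    "REQUESTS_CA_BUNDLE",
    "CURL_CA_BUNDLE",
    "NODE_EXTRA_CA_CERTS",
)


def resolve_ca_bundle(env=None):
    """Return a usable CA bundle path, or None if nothing is found."""
    env = os.environ if env is None else env
    configured = env.get("SSL_CERT_FILE")
    if configured and Path(configured).is_file():
        return configured  # trust the variable only if the file exists
    try:
        import certifi  # optional dependency in this sketch
        return certifi.where()
    except ImportError:
        pass
    for candidate in SYSTEM_CA_CANDIDATES:
        if Path(candidate).is_file():
            return candidate
    return None


def cert_env(bundle_path: str) -> dict:
    """Build the four certificate variables for a child process."""
    return {name: bundle_path for name in CERT_ENV_VARS}
```
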
…nfig

Package rename: src/ccproxy/mitm/ → src/ccproxy/inspector/
- InspectConfig → InspectorConfig with field cleanup
- CCProxyMitmAddon → InspectorAddon
- MitmTracer → InspectorTracer
- start_mitm → start_inspector, get_mitm_status → get_inspector_status
- YAML key: ccproxy.inspect → ccproxy.inspector
- JSON status key: "mitm" → "inspector"
- Env vars: CCPROXY_MITM_* → CCPROXY_INSPECTOR_*

Config model changes:
- New MitmproxyOptions pydantic stub exposes mitmproxy --set flags
  (ssl_insecure, stream_large_bodies, web_password, etc.)
- Removed forward_port, reverse_port, upstream_proxy (auto-derived)
- inspect_port → port
- Added provider_map for OTel gen_ai.system attribute mapping
- Extracted otel_enabled/endpoint/service_name to top-level OtelConfig
- Model validator syncs cert_dir → mitmproxy.confdir
- start_inspector() consumes InspectorConfig directly
- No auto-generated web_password — user opts in via config

Deleted docs/mitm.md (entirely stale).

Replace Prisma-based trace storage with OTel-only telemetry. The inspector
addon now emits OTel spans exclusively — no database writes.

- Delete storage.py (Prisma ORM wrapper), prisma/ schema, nix/prisma-cli/
- Remove db CLI commands (db-sql, db-gql, db-prompt) and ~760 lines of handlers
- Remove ccproxy-db and ccproxy-graphql docker services
- Remove prisma, asyncpg dependencies from pyproject.toml
- Remove prismaGenerated from flake.nix package wrapper
- Gut storage parameter from InspectorAddon and TraceStorage wiring from script.py
- Remove ensure_prisma_client, _auto_generate_prisma, _resolve_database_url from process.py
- Remove InspectorConfig.database_url field
- Load OtelConfig from ccproxy.yaml instead of CCPROXY_OTEL_* env vars
- Fix misc ruff/mypy diagnostics across pipeline, hooks, utils

Remove database_url, graphql, max_body_size, excluded_hosts from
inspector defaults. Add otel section with disabled-by-default config
matching OtelConfig pydantic model.

The WireGuard listener port is internal plumbing: users connect to the
LiteLLM HTTP port, not the WG tunnel. Auto-assign a free UDP port at
inspector startup instead of exposing it in InspectorConfig.

- Remove wireguard_port from InspectorConfig, nix defaults, template
- Add _find_free_udp_port() in process.py for auto-assignment
- _rewrite_wg_endpoint() parses port from mitmweb's client config
  instead of receiving it as a parameter
- create_namespace() drops wg_port parameter
- Remove UDP preflight check (port is ephemeral)
- Update CLAUDE.md: remove Prisma/db/GraphQL/PostGraphile references,
  update inspector and Docker container docs for OTel-only architecture
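
Auto-assigning a free UDP port, as `_find_free_udp_port()` is described as doing above, is commonly done by binding to port 0 and reading back what the kernel chose. The sketch below assumes that approach; the real helper's body is not shown in the source.

```python
import socket


def find_free_udp_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for an unused UDP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]
```

The port is released as soon as the socket closes, so another process could claim it before the listener binds. For a short-lived local daemon that window is usually acceptable.
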
mitmproxy 12.x always requires web authentication — if no web_password
is set, it auto-generates a random token, blocking programmatic /state
API access (403 on every request). This broke WireGuard client config
retrieval needed for namespace jail mode.

Generate a secrets.token_hex(16) when no explicit web_password is
configured, pass it via --set web_password=<token>, and return it from
start_inspector() so the caller can authenticate /state requests.
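
The token handling described above can be sketched in a few lines. The helper name `resolve_web_password` is hypothetical; `secrets.token_hex(16)` and the `--set web_password=<token>` form come from the commit message.

```python
import secrets


def resolve_web_password(configured=None):
    """Return (token, extra CLI args); generate a token when none is configured."""
    token = configured or secrets.token_hex(16)  # 16 bytes -> 32 hex characters
    return token, ["--set", f"web_password={token}"]
```

Returning the token alongside the arguments lets the caller authenticate later `/state` requests with the same value that was handed to mitmweb.
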
…isolation

Each `ccproxy start --inspect` now stores its WireGuard keypair at
`{config_dir}/wireguard.{pid}.conf`, allowing multiple independent
ccproxy stacks to coexist without conflicting on the shared mitmproxy
confdir. The CA cert remains shared so clients only trust one CA.

- Remove `wireguard_conf_path` from InspectorConfig (now internally managed)
- Pass `wireguard_conf_path` as explicit parameter to `start_inspector()`
- Clean stale `.inspector-wireguard-client.conf` on each startup
- Clean up PID-tagged WG keypair on shutdown
- Add preflight cleanup of orphaned `wireguard.*.conf` for dead PIDs
- Enable `--inspect` by default in process-compose dev config
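
The preflight cleanup of orphaned `wireguard.*.conf` files for dead PIDs can be sketched as below. It is POSIX-only, the function names are hypothetical, and the liveness check via `os.kill(pid, 0)` is an assumption about how "dead" is decided.

```python
import os
import re
from pathlib import Path

_WG_CONF = re.compile(r"^wireguard\.(\d+)\.conf$")


def pid_alive(pid: int) -> bool:
    """Signal 0 checks for existence without delivering a signal (POSIX)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def cleanup_orphaned_wg_confs(config_dir, is_alive=pid_alive):
    """Delete wireguard.<pid>.conf files whose owning PID is gone."""
    removed = []
    for path in sorted(Path(config_dir).glob("wireguard.*.conf")):
        match = _WG_CONF.match(path.name)
        if match and not is_alive(int(match.group(1))):
            path.unlink()
            removed.append(path.name)
    return removed
```

PIDs can be reused by unrelated processes, so a check like this may occasionally keep a stale file. That is the safe direction to err in.
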
starbaser added 16 commits May 4, 2026 18:23
….claude/.credentials.json with claudeAiOauth.* glom paths
…se with glom-configurable credential paths

Renamed `_OAuthFields` → `AuthFields` and dropped the `OAuth` prefix from
the per-provider classes (`{Command,File,Anthropic,Google}AuthSource`) so
the names cover non-OAuth credential sources cleanly.

A new `AuthSource(AuthFields)` refresh base absorbs the shared
read → maybe-refresh → write-back template method that previously lived
duplicated across `oauth/anthropic.py` and `oauth/google.py`. Subclasses
now provide only the per-provider POST body via `_build_refresh_body`
and a few default overrides (`endpoint`, `file_path`, `client_id`,
`default_expires_in_seconds`, `expiry_path`).

Three glom-configurable paths (`access_path`, `refresh_path`,
`expiry_path`) make the credential schema declarative — set them to
`claudeAiOauth.accessToken` etc. to share `~/.claude/.credentials.json`
with the Claude Code CLI without renaming on-disk keys. `_write_credentials`
deep-copies the input and uses `glom.assign(..., missing=dict)` so
nested writes preserve sibling fields (`scopes`, `subscriptionType`)
that the host CLI wrote.

Deleted `oauth/anthropic.py` and `oauth/google.py`; their content is
now method bodies on `AuthSource`. The discriminated-union alias is
renamed `OAuthSource` → `AnyAuthSource` so the class name `AuthSource`
is unambiguous, and `parse_oauth_source` → `parse_auth_source`.
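
The sibling-preserving write described above (deep copy, then a nested assign that creates missing parents) can be illustrated without the glom dependency. The `assign_path` helper below is a standard-library stand-in for `glom.assign(..., missing=dict)`, not the code from the commit, and the credential values are made up.

```python
import copy


def assign_path(data: dict, dotted_path: str, value):
    """Return a deep copy of `data` with `value` set at `dotted_path`."""
    updated = copy.deepcopy(data)  # never mutate the caller's dict
    node = updated
    *parents, leaf = dotted_path.split(".")
    for key in parents:
        node = node.setdefault(key, {})  # create missing parents as dicts
    node[leaf] = value
    return updated


# Hypothetical on-disk shape for illustration.
on_disk = {
    "claudeAiOauth": {
        "accessToken": "old",
        "scopes": ["a"],
        "subscriptionType": "example",
    }
}
refreshed = assign_path(on_disk, "claudeAiOauth.accessToken", "new")
```

Because only the leaf is replaced, fields written by another tool sharing the file (here `scopes` and `subscriptionType`) survive the write-back.
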
Eliminate redundant json.loads of the request body across the inspector
addon. _extract_session_id and _enrich_record_with_conversation_ids
now read through FlowRecord.parsed_request_body, a parse-once cache
keyed on the record. _extract_session_id becomes a static
_extract_session_id_from_body that consumes the cached dict.

Pipeline-side Context._body lazy-parse stays as-is — its lifecycle is
per-pipeline-invocation, not per-flow.
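
A parse-once cache keyed on the record, as described above, can be sketched with `functools.cached_property`. The class name `FlowRecordSketch` and the `raw_request_body` field are hypothetical; only the `parsed_request_body` name comes from the commit, and the real caching mechanism may differ.

```python
import json
from dataclasses import dataclass
from functools import cached_property


@dataclass
class FlowRecordSketch:
    raw_request_body: bytes

    @cached_property
    def parsed_request_body(self):
        """Parse the body at most once; None if it is not a JSON object."""
        try:
            parsed = json.loads(self.raw_request_body)
        except ValueError:  # includes JSONDecodeError and UnicodeDecodeError
            return None
        return parsed if isinstance(parsed, dict) else None


record = FlowRecordSketch(b'{"model": "example", "messages": []}')
first = record.parsed_request_body
second = record.parsed_request_body  # served from the cache, no second parse
```
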
Lifts the response-side 401 detect → refresh → replay loop out of
InspectorAddon into its own mitmproxy addon (ccproxy.inspector.oauth_addon).
The new addon owns nothing else, keeping its responsibility surface single.

Trigger contract is unchanged: forward_oauth stamps
flow.metadata["ccproxy.oauth_injected"] and ["ccproxy.oauth_provider"];
OAuthAddon.response reads those and replays the request when it sees a 401
on a flow ccproxy injected.

Registered before InspectorAddon during the Phase E transition so the retry
runs before InspectorAddon's still-resident capacity-fallback and Gemini
envelope-unwrap branches see the response. Wave 6 will move those branches
into a dedicated GeminiAddon, after which the addon chain becomes more
linear.

InspectorAddon shrinks by ~50 LOC (51 lines deleted). Unit tests for the
retry behavior move from tests/test_inspector_addon.py to
tests/test_oauth_addon.py and grow from 11 → 17 cases (added: response()
gate behavior, http error swallowing, body+method preservation).

Extracts the response-side Gemini envelope unwrap (both the streaming
EnvelopeUnwrapStream install and the buffered unwrap_buffered call) out
of InspectorAddon into a dedicated GeminiAddon. Registered after
InspectorAddon so its responseheaders can install
EnvelopeUnwrapStream on the streaming Gemini redirect flows that
InspectorAddon now leaves untouched.

Phase E.2 of the structural addon split. The capacity-fallback defer
branch in InspectorAddon.responseheaders and the try_fallback_models
dispatch in InspectorAddon.response stay untouched for one more commit;
Wave 6 (Phase E.3) absorbs both into GeminiAddon and dissolves the
gemini_capacity_fallback hook module entirely.
…lve fake hook

The Gemini RESOURCE_EXHAUSTED retry orchestration (sticky retries on the
original model, then walking a fallback chain) moves from
``ccproxy/hooks/gemini_capacity_fallback.py`` (deleted) onto
``ccproxy/inspector/gemini_addon.py``. The legacy file was a fake ``@hook``
shell that just stashed config in a module-global; the actual orchestrator
ran from ``InspectorAddon.response``. With this commit:

- ``GeminiAddon.responseheaders`` owns the capacity-defer branch (skip
  ``EnvelopeUnwrapStream`` install on a 429/503 when fallback is enabled
  so mitmproxy buffers the body for retry).
- ``GeminiAddon.response`` runs ``_try_fallback_models`` first, then the
  envelope unwrap looks at the (possibly retry-replaced) response.
- ``InspectorAddon.responseheaders`` loses the capacity-defer branch;
  ``InspectorAddon.response`` loses the capacity-fallback dispatch.

Pydantic params graduate from a fake hook's ``model=`` to a real
``CCProxyConfig.gemini_capacity: GeminiCapacityFallbackConfig`` block.
The legacy ``hooks.outbound: ccproxy.hooks.gemini_capacity_fallback``
entry is now a hard load-time error with a clear migration message — no
backwards-compat shim, per Kyle's "backwards compatibility is useless"
doctrine.

Final addon chain (Phase E end-state): ``InspectorAddon → MultiHARSaver
→ ShapeCapturer → inbound pipeline → transform → outbound pipeline →
OAuthAddon → GeminiAddon``. The transitional Wave 4 placement
(``OAuthAddon`` before ``InspectorAddon``) is reversed to the plan's
final shape; ``OAuthAddon.response`` runs before ``GeminiAddon.response``
so a 401 → refresh → replay → 429 sequence naturally cascades into
capacity fallback.

Wave 6 dissolved ``ccproxy.hooks.gemini_capacity_fallback`` into
GeminiAddon and graduated its params to ``CCProxyConfig.gemini_capacity``.
The transitional load-time RuntimeError that flagged stale config entries
has outlived its purpose: per the no-backwards-compat doctrine, just
delete it. Stale entries in users' configs will now silently fall through
the hook registry (no module by that path resolves), and the rebuild that
ships this commit also regenerates Nix-store YAMLs from the already-clean
``nix/defaults.nix``.

Removes ``_reject_legacy_capacity_fallback_hook`` and its call site in
``CCProxyConfig.from_yaml``, plus the corresponding
``TestLegacyCapacityFallbackHookEntry`` test class.
…ct token-freshness contract

Carve the semantic line between the two static credential value loaders
(CommandAuthSource / FileAuthSource) and the OAuth refresh-capable base
(AuthSource, with AnthropicAuthSource and GoogleAuthSource subclasses):

- README.md: add an "Auth source types" subsection in Configuration with
  a four-row type table (command, file, anthropic_oauth, google_oauth)
  and a paired DeepSeek (static API key) + Anthropic (oauth refresh)
  Provider config example.
- docs/configuration.md: correct the "Required keys" column for
  anthropic_oauth/google_oauth (file_path, not refresh_token_file); add
  an "Auth source class hierarchy" diagram explaining AuthFields →
  CommandAuthSource / FileAuthSource / AuthSource subclasses; add an
  "OAuth refresh lifecycle" subsection covering the 60s expiry headroom,
  the deepcopy + glom.assign(missing=dict) sibling-preservation pattern,
  the gemini-cli #21691 refresh_token fallback, the from_yaml →
  _load_credentials → prewarm_project startup ordering, and a
  "Why Gemini wants google_oauth" subsection explaining how type:command
  silently breaks prewarm_project's loadCodeAssist call when the on-disk
  token is expired at startup.
- CLAUDE.md: rewrite the oauth/ subsystem bullet to describe the
  AuthFields / AuthSource hierarchy (no separate oauth/anthropic.py /
  oauth/google.py modules anymore); add a Gemini recommendation to the
  Providers & Sentinel Keys section; update the Triage Principle's
  reference from GoogleOAuthSource → GoogleAuthSource.

Reformat 21 files (11 source + 10 test) that drifted from `ruff format`
output during the recent refactor — collapses fitting-on-one-line
function calls / docstrings / generator expressions, no behavior change.
`ruff format --check .` is now clean.
…nder inspector

Fix stale debug→log_level references and remove dissolved
gemini_capacity_fallback hook from all docs. Add missing sections for
logging, upstream timeout, gemini_capacity fallback, mitmproxy options,
and anthropic billing header. Move readiness probe fields from top-level
to inspector.readiness with process-compose-style naming (url,
timeout_seconds), dropping the boolean toggle in favor of null url.

Refresh against current code: drop stale references, fold in verified
enrichment fields on FlowRecord, subprocess loggers, TermLog disable,
and homeModules.ccproxy export. Rewrite the dev/prod section as a
top-down architecture reference covering nix/defaults.nix → mkConfig
(dev) / nix/module.nix (production HM) / render_template.py.

Gitignore CLAUDE.local.md so per-machine production notes stay out of
the repo.
…prune agent-sdk

Add `sdk` optional dependency group (google-genai, openai) so users can
install example dependencies with `uv add claude-ccproxy[sdk]`.

New examples in docs/sdk/:
- gemini_sdk.py — google-genai SDK with Gemini sentinel key
- deepseek_sdk.py — Anthropic SDK with DeepSeek sentinel key
- lightllm_transform.py — OpenAI SDK through lightllm cross-format
  transform to Anthropic and Gemini

Removed: agent_sdk_caching_example.py (Claude Agent SDK, not Anthropic SDK)
and examples/litellm_sdk.py (duplicate of docs/sdk/litellm_sdk.py).

The cross-format transform was broken in several spots that combined to
produce empty or wrong-prompt responses on every OpenAI→Gemini call,
and direct google-genai SDK calls silently degraded to empty prompts.

- gemini_cli outbound hook overwrote the TransformMeta the route
  handler had stamped, dropping mode="transform" back to the default
  "redirect" so the response handler skipped re-serialization. Only
  create a TransformMeta now when none exists upstream.
- _detect_incoming_format had no /gemini/ prefix variant (anthropic
  already has /anthropic/), so SDK base_url=".../gemini" calls were
  classified as cross-format and LiteLLM stripped Gemini's `contents`
  to empty before forwarding.
- After _handle_redirect rewrote the path to /v1internal:{action},
  gemini_cli could no longer extract the model from the path; fall
  back to TransformMeta.model so the v1internal envelope carries it.
- Buffered transform_to_openai now unwraps cloudcode-pa's
  {response: {...}} envelope inline, since GeminiAddon.response (the
  usual unwrap point) runs later in the addon chain.
- SseTransformer splits on \r\n\r\n (the actual cloudcode-pa boundary)
  as well as \n\n, and unwraps the envelope per chunk for
  Gemini-family providers so the GeminiIterator sees raw chunks.
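
The last item in the list above, splitting SSE events on `\r\n\r\n` as well as `\n\n`, can be sketched with a small stateful splitter. The class name `SseEventSplitter` is hypothetical and the sketch covers only boundary detection, not the per-chunk envelope unwrap or re-serialization.

```python
import re

# Try the four-byte CRLF boundary first, then the plain LF boundary.
_EVENT_BOUNDARY = re.compile(rb"\r\n\r\n|\n\n")


class SseEventSplitter:
    """Accumulate bytes and emit complete SSE events as they arrive."""

    def __init__(self) -> None:
        self._buffer = b""

    def feed(self, chunk: bytes) -> list:
        self._buffer += chunk
        # Everything after the last boundary is an incomplete event; keep it.
        *events, self._buffer = _EVENT_BOUNDARY.split(self._buffer)
        return [event for event in events if event]


splitter = SseEventSplitter()
out1 = splitter.feed(b"data: a\r\n\r")          # boundary split across chunks
out2 = splitter.feed(b"\ndata: b\n\ndata: c")   # completes a, emits b, holds c
out3 = splitter.feed(b"\n\n")                   # completes c
```

Keeping the unfinished tail in the buffer is what makes the splitter safe when a network chunk ends in the middle of a boundary.
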

Also in this change:

- Finish the readiness refactor: flat verify_readiness_on_startup,
  readiness_probe_url, readiness_probe_timeout_seconds on
  CCProxyConfig; drop the nested ReadinessProbeConfig and update the
  inspector probe + tests to match.
- Default anthropic provider reads ~/.claude/.credentials.json via jq
  (works on any machine with Claude Code logged in); gemini switches
  to type=google_oauth with gemini-cli's installed-app credentials so
  token refresh happens in-process.
- SDK examples honor CCPROXY_BASE_URL; litellm_sdk uses the real
  anthropic sentinel instead of sk-proxy-dummy (which never resolved
  to a provider and 501'd).
- Dev shell runs uv sync --extra sdk so google-genai stays in .venv,
  and exports CCPROXY_BASE_URL=http://127.0.0.1:4001 so the SDK
  examples target the dev instance by default.
…llback header staleness

`OAuthAddon._retry_with_refreshed_token` now writes the refreshed token onto
`flow.request.headers[target_header]` before issuing the replay, so downstream
addons (`GeminiAddon` capacity fallback) re-fire with the current token instead
of inheriting the pre-refresh stale one. This was the root cause of production
flow ca32b740 — a real 429 on `gemini-3.1-pro-preview` cascaded into a 401
storm because the fallback's local httpx client copied a stale token from
`flow.request.headers`.
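
The fix above amounts to updating the request headers before anything downstream copies them. A minimal sketch follows, assuming the default path uses an `Authorization: Bearer` header and that a custom header carries the bare token. The helper name and the custom-header format are assumptions, and a plain dict stands in for mitmproxy's header object.

```python
def apply_refreshed_token(headers: dict, token: str, target_header: str = "authorization") -> dict:
    """Write the refreshed token onto the request headers in place."""
    if target_header.lower() == "authorization":
        headers[target_header] = f"Bearer {token}"
    else:
        headers[target_header] = token  # assumed format for custom headers
    return headers


request_headers = {"authorization": "Bearer stale"}
apply_refreshed_token(request_headers, "fresh")
# A later addon that copies the headers now inherits the current token.
fallback_headers = dict(request_headers)
```
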

Drop the in-memory `_cached_auth_tokens` dict, `get_oauth_token`,
`_resolve_oauth_token`, `refresh_oauth_token`, and `_load_credentials`. Replace
with `CCProxyConfig.resolve_oauth_token(provider)` — wraps `Provider.auth.resolve()`
under the existing per-provider lock, reads disk every call. External writers
(claude-cli, gemini-cli) sharing the credential file now propagate immediately;
no more one-way mirror that only invalidates on a 401.

Strip the no-sentinel fallback walk from `forward_oauth`. Picking the first
provider with a cached token by YAML order was a credential-leak waiting to
happen — sentinel-or-nothing is the only sane contract.

Consolidate `CredentialSource` into `AnyAuthSource` for `web_password`
(`AnyAuthSource | str | None` with a coercing field validator). Delete the
parallel class.

Regression test in tests/issues/regression/ asserts `flow.request.headers` is
updated post-refresh for both default Bearer and custom-header paths.

Extends Gemini capacity fallback to handle backend INTERNAL errors in
addition to RESOURCE_EXHAUSTED. The retry_status_codes config now
defaults to [429, 503, 500].
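
The retry order described for the capacity fallback (sticky retries on the original model, then a walk down the fallback chain, triggered by the status codes above) can be sketched as follows. The function names, the `send` callback, and the model names are hypothetical; the real orchestrator works on mitmproxy flows rather than bare status codes.

```python
RETRY_STATUS_CODES = frozenset({429, 503, 500})  # defaults named in the commit


def attempt_order(original_model, fallback_models, sticky_retries):
    """Yield models to try after the first response hit a retryable status."""
    for _ in range(sticky_retries):
        yield original_model
    for model in fallback_models:
        if model != original_model:
            yield model


def run_with_fallback(send, original_model, fallback_models, sticky_retries=1):
    """Call send(model) until a non-retryable status comes back."""
    last_status = None
    for model in attempt_order(original_model, fallback_models, sticky_retries):
        last_status = send(model)
        if last_status not in RETRY_STATUS_CODES:
            return model, last_status
    return None, last_status  # every attempt was exhausted


# Hypothetical upstream: model-a is out of capacity, model-b answers.
calls = []


def fake_send(model):
    calls.append(model)
    return {"model-a": 429, "model-b": 200}.get(model, 500)


result = run_with_fallback(fake_send, "model-a", ["model-b", "model-c"], sticky_retries=2)
```
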
Comment thread src/ccproxy/config.py Dismissed
starbaser added 12 commits May 6, 2026 21:48
…fix cache

Adds a deterministic UUID5 session_id derived from (model, project, conversation)
into the v1internal envelope's request object, matching real Gemini CLI wire
format (verified against captured flow at .config/ccproxy/compliance/seeds).
Empirically confirmed the server-side cache engages — 98.2% of prompt tokens
served from cache on a third same-conversation request via cachedContentTokenCount.

Stable across daemon restarts (no per-process anchor) and across model tier
changes within a logical conversation. Pre-existing user_prompt_id top-level
field is unchanged.
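
A deterministic UUID5 derived from a tuple of strings can be sketched as below. The namespace value, the separator, and the function name are assumptions; the source does not say how the three components are normalized or joined, so the sketch treats each as an opaque string.

```python
import uuid

# Hypothetical namespace; any fixed UUID works as long as it never changes.
SESSION_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "ccproxy.example")


def derive_session_id(model: str, project: str, conversation: str) -> str:
    """Same inputs always produce the same UUID, with no per-process state."""
    name = "\x1f".join((model, project, conversation))  # unit separator avoids ambiguity
    return str(uuid.uuid5(SESSION_NAMESPACE, name))


sid_a = derive_session_id("model-x", "proj-1", "conv-1")
sid_b = derive_session_id("model-x", "proj-1", "conv-1")
sid_c = derive_session_id("model-x", "proj-1", "conv-2")
```

Because UUID5 is a hash of namespace plus name, the identifier survives daemon restarts without being stored anywhere.
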

addon.py also extends conversation_id derivation to handle Gemini-shape
contents (was Anthropic messages only); without this, native Gemini traffic
would always fall back to flow.id and never share a session_id across turns.

Adds 'just restart' for the dev daemon — 'just up' alone is idempotent and
won't pick up source changes.

Removes tests duplicated by parametrized siblings (e.g., test_get_header_returns_value
subsumed by test_get_header_exact_key_match) and trivial constant-equality assertions
(InspectorMeta.RECORD value, contentview .name/.syntax_highlight). Bumps nixpkgs
and uv2nix to current upstream.

Cross-checked every markdown file against the source after the recent
inspector extractions (OAuthAddon, GeminiAddon, MultiHARSaver,
ShapeCapturer) and the shape-replay subsystem rewrite. Corrects stale
hook names (apply_shaping → shape), addon-chain enumerations, OAuth
401-retry mechanics, and Gemini envelope-unwrap file paths. Removes
docs/llms/ (vendored litellm reference material) and .claude/AGENTS.md.

Largest rewrites: USAGE.md §6 (replaced obsolete passive-learning
shaping description with a pointer to docs/shaping.md) and
skills/using-ccproxy-api/reference/troubleshooting.md (full rewrite —
referenced three nonexistent helper scripts and the obsolete shaping
system throughout).

Improves readability of hook parameter signatures in the pipeline render
output by displaying each parameter on its own line with YAML-style
formatting instead of inline comma-separated values.

Implements Perplexity Pro as a ccproxy-internal BaseConfig registered in
lightllm/registry.py, routing to
www.perplexity.ai/rest/sse/perplexity_ask with session-token cookie
auth. Includes PerplexityProIterator for delta-chunk conversion and
perplexity_signin.py script for Gmail-OTP token refresh.

Adds a TLS+HTTP/2 fingerprint impersonation path so Cloudflare-fronted
upstreams (ChatGPT/Codex first; others as needed) stop flagging ccproxy's
stock pyOpenSSL ClientHello. Default behaviour is unchanged — mitmproxy's
native transport stays the default until a Provider opts in.

R1 — transport/dispatch.py + transport/__init__.py
  Cached httpx.AsyncClient per (host, profile), backed by
  httpx-curl-cffi's AsyncCurlTransport. LRU=16 + 60s idle eviction.
  Profile names validated against curl_cffi.requests.impersonate
  BrowserTypeLiteral at the cache boundary; UnknownFingerprintProfileError
  on misconfiguration. DEFAULT_PROFILE = chrome131.
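
The caching policy in R1 (one client per `(host, profile)`, at most 16 entries, 60 seconds of idle time) can be sketched with an `OrderedDict`. The `ClientCache` class and its `factory` callback are hypothetical, the clock is injectable for testing, and a real version would also close evicted clients, which this sketch omits.

```python
import time
from collections import OrderedDict


class ClientCache:
    """LRU cache with idle eviction, keyed on (host, profile)."""

    def __init__(self, factory, max_size=16, idle_seconds=60.0, clock=time.monotonic):
        self._factory = factory
        self._max_size = max_size
        self._idle_seconds = idle_seconds
        self._clock = clock
        self._entries = OrderedDict()  # key -> (client, last_used)

    def get(self, host, profile):
        now = self._clock()
        # Drop entries that have been idle for too long.
        stale = [k for k, (_, last) in self._entries.items() if now - last > self._idle_seconds]
        for key in stale:
            del self._entries[key]

        key = (host, profile)
        if key in self._entries:
            client, _ = self._entries.pop(key)
        else:
            client = self._factory(host, profile)
        self._entries[key] = (client, now)  # most recently used goes last

        # Enforce the size bound by evicting the least recently used entry.
        while len(self._entries) > self._max_size:
            self._entries.popitem(last=False)
        return client
```

Keying on the profile as well as the host keeps connections with different TLS fingerprints from being mixed on one client.
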

R2 — swap retry httpx for the cached dispatcher
  oauth_addon._retry_with_refreshed_token and gemini_addon._attempt_request
  now use transport.get_client(host=..., profile=...). Profile is read
  from flow.metadata['ccproxy.fingerprint_profile'] with the default as
  fallback. Stamps flow.metadata['ccproxy.retry_transport'] = 'curl_cffi'
  and ['ccproxy.retry_profile'] for observability.

R3 — sidecar + TransportOverrideAddon + Provider.fingerprint_profile
  In-process Starlette+uvicorn HTTP server bound to 127.0.0.1:<auto>,
  started before WebMaster, stopped after master_task. Two-header
  contract: X-CCProxy-Target-Url + X-CCProxy-Impersonate. Forwards via
  the cached httpx-curl-cffi client and streams responses chunk-by-chunk
  through client.send(stream=True) + aiter_raw(); hop-by-hop stripped
  both directions.

  TransportOverrideAddon slots between the outbound DAG and OAuthAddon.
  When the resolved Provider has fingerprint_profile != None, it
  rewrites flow.request.host/port/scheme to the sidecar and stashes the
  real target URL + profile in headers. The R3-spike confirmed
  mitmproxy doesn't invoke flow.response.stream from a request()-hook
  short-circuit, so a sidecar with native mitmproxy upstream streaming
  was the path through.

  Provider.fingerprint_profile (str | None) validated against
  transport.VALID_PROFILES; None default preserves status quo.

R4 — inspector fidelity for impersonated flows
  SSLKEYLOGFILE alongside MITMPROXY_SSLKEYLOGFILE so curl-cffi writes
  session keys into the same tls.keylog; Wireshark decrypts every leg
  from one file. FlowRecord.forwarded_request snapshot (post-pipeline
  pre-rewrite) populated by TransportOverrideAddon; MultiHARSaver uses
  it so ccproxy flows compare/dump show the real upstream URL instead
  of 127.0.0.1:<sidecar>. New ForwardedRequestContentview surfaces it
  in mitmweb's flow detail panel.

Plus: pre-existing SseTransformer tests in test_response_transform.py
fixed (5 stale assertions expected b'' where the impl correctly
returns [] to avoid emitting the chunked-encoding EOS marker).

curl-cffi pyprojectOverrides entry in flake.nix mirrors the existing
tokenizers override so the Nix-built derivation patches libstdc++.so.6
into the wheel's RPATH.

Full suite: 1423 passed, 0 failed.
…oot handler streams

Uvicorn's default LOGGING_CONFIG runs through logging.config.dictConfig(),
which calls _clearExistingHandlers() unconditionally (regardless of
disable_existing_loggers). That closes every root-logger handler's stream
— including ccproxy's FileHandler for ccproxy.log — leaving only the first
line that landed before Sidecar.start() ran. Stderr still got logs because
process-compose captures stdout/stderr at the process level, but
ccproxy.log was effectively single-line and `ccproxy logs` returned almost
nothing useful.

Set log_config=None so uvicorn skips its logging setup entirely.
ccproxy's setup_logging is the single source of truth.
… matrix + QEMU release gate

Adds three-tier install validation so the package actually works for users
who don't run NixOS:

- Tier 1+2 GHA workflow (.github/workflows/validate-install.yml): nix
  flake check, uv-built wheel as artifact, container matrix over
  debian:12, ubuntu:24.04, fedora:44, archlinux:latest, plus a
  macos-latest job for the reverse-proxy code path.
- Tier 3 local QEMU+KVM release test (scripts/qemu_release_test.sh):
  boots a vanilla cloud image, scp's the wheel, pip-installs it, and
  runs smoke + daemon-start checks. Supports debian-12, ubuntu-24.04,
  fedora-44. Wired up via `just release-test-qemu` / `release-test-qemu-all`.

Required dep changes:

- Swap `xepor>=0.6.0` for `xepor-ccproxy>=0.7.0`. Upstream xepor 0.6.0 is
  unmaintained (last release 2023-07-06) and pins `mitmproxy<10.0.0`,
  which made the wheel uninstallable from PyPI for non-Nix users (the
  `[tool.uv] override-dependencies` workaround only applied locally, not
  to downstream consumers). xepor-ccproxy is our fork
  (github.com/starbaser/xepor, branch ccproxy/mitmproxy12, tag v0.7.0)
  with the mitmproxy-12 Server(address=...) fix, wildcard host support,
  request/response routeless short-circuit, and mitmproxy<14 constraint.
  Upstream PR pending.
- Drop the [tool.uv] override-dependencies block (no longer needed).
- cli.py: remove check_namespace_capabilities() preflight from
  _run_inspect (the daemon side). The daemon itself doesn't use
  Linux namespaces — that's only the `ccproxy run --inspect` path —
  so the check was over-eager and prevented `ccproxy start` from
  working on macOS in reverse-proxy mode.
- Add `Operating System :: POSIX :: Linux` and
  `Operating System :: MacOS :: MacOS X` classifiers.
- flake.nix devShell: add qemu_kvm + cloud-utils for local Tier 3 runs.

README: new Installation section with per-platform (Linux / WSL2 /
macOS) install instructions, the system-package list per distro,
AppArmor unprivileged-userns sysctl note for Ubuntu 24.04+, and a
platform-support matrix. Old Troubleshooting > Inspector prerequisites
collapsed to point back to Installation.

Validated end-to-end on debian-12, ubuntu-24.04, fedora-44 via QEMU+KVM.

macOS runners bill at 10x rate compared to Linux runners


2 participants