Add Origin header validation for DNS-rebind protection#4908
Conversation
70df577 to
f2aa438
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #4908 +/- ##
==========================================
+ Coverage 68.77% 68.82% +0.04%
==========================================
Files 629 630 +1
Lines 63914 64032 +118
==========================================
+ Hits 43955 44068 +113
- Misses 16703 16705 +2
- Partials 3256 3259 +3 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
jhrozek
left a comment
There was a problem hiding this comment.
A few clarifying questions on wiring/coverage and a couple of style questions. Nothing blocking — just want to make sure the integration paths and API shape are intentional. Implementation itself looks solid; the spec compliance is correct.
| // the middleware disabled. | ||
| func WithAllowedOrigins(origins []string) RunConfigBuilderOption { | ||
| return func(b *runConfigBuilder) error { | ||
| b.config.AllowedOrigins = origins |
There was a problem hiding this comment.
The new addOriginMiddleware is added to PopulateMiddlewareConfigs, but WithMiddlewareFromFlags / addCoreMiddlewares (the path used by thv run) doesn't include it. Since runner.Run skips PopulateMiddlewareConfigs when MiddlewareConfigs is pre-populated (runner.go:232), thv run --allowed-origins=... plumbs the flag into RunConfig.AllowedOrigins — which is exactly what this builder option does — but the middleware never registers at runtime.
Is this intentional, or an omission? If intentional, what's the rationale for excluding thv run from the protection?
There was a problem hiding this comment.
Good catch — and the fix needed to go deeper than moving the call into WithMiddlewareFromFlags. The builder resolves the effective port in validateConfig (via WithPorts), which runs after all options, so b.config.Port is still 0 during WithMiddlewareFromFlags — ResolveAllowedOrigins would have returned nil and silently disabled loopback defaults for thv run. Fixed by wiring Origin validation centrally in runner.Run (new prependOriginMiddleware), after both population paths, where Host/Port/AllowedOrigins are fully resolved. Removed it from PopulateMiddlewareConfigs so there's one wiring site, and prepended it so it still runs first (rejects before auth). Added TestPrependOriginMiddleware covering the thv run path.
| // loopback-only allowlist is derived at middleware-wiring time. | ||
| // When empty and Host is non-loopback, the middleware is disabled — operators | ||
| // exposing the proxy publicly must configure an explicit allowlist. | ||
| AllowedOrigins []string `json:"allowed_origins,omitempty" yaml:"allowed_origins,omitempty"` |
There was a problem hiding this comment.
The operator path uses PopulateMiddlewareConfigs so the factory side is wired, but MCPServerSpec / MCPRemoteProxySpec / VirtualMCPServerSpec have no AllowedOrigins field. Combined with operator-deployed pods binding to non-loopback addresses, ResolveAllowedOrigins returns nil and addOriginMiddleware skips registration with a WARN — so K8s deployments ship with Origin validation disabled.
Is this expected, planned for a follow-up PR, or considered out of scope for the CRDs? If a follow-up, would it be worth a // TODO or a note in the PR description so it's tracked?
There was a problem hiding this comment.
Deferring CRD wiring to a follow-up to keep this PR focused on CLI/proxyrunner. Documented explicitly in the commit message and in the prependOriginMiddleware doc comment: operator non-loopback pods log the WARN rather than enforcing until the CRD field lands. Tracked in a follow-up issue: "Add allowedOrigins to MCPServer/MCPRemoteProxy/VirtualMCPServer CRDs + operator wiring."
| // middleware is skipped entirely and a WARN is logged so the security-disabled | ||
| // state is visible in operator logs. A follow-up PR hardens the non-loopback | ||
| // path by requiring an explicit opt-in flag (see audit row 22). | ||
| func addOriginMiddleware(middlewares []types.MiddlewareConfig, config *RunConfig) ([]types.MiddlewareConfig, error) { |
There was a problem hiding this comment.
The commit message says the middleware is wired into "thv run / thv-proxyrunner / vMCP", but vMCP composes its own chain via factory.NewIncomingAuthMiddleware and pkg/vmcp/server/server.go Handler — neither references the new origin package. Grep across pkg/vmcp/ and cmd/vmcp/ finds no usage.
Should vMCP be wired here too, or should the commit message be corrected to drop the vMCP claim and tracked as a follow-up?
There was a problem hiding this comment.
Correct — vMCP composes its own chain and doesn't reference this package. Dropped the "vMCP" claim from the commit message and called out vMCP wiring as a follow-up. Tracked in a follow-up issue: "Wire Origin validation into the vMCP middleware chain."
| // host ASCII-case-insensitive; the port is a decimal integer and has no case. | ||
| // Malformed inputs (no "://" separator) are returned lowercased in full on the | ||
| // assumption that they will simply not match any legitimate allowlist entry. | ||
| func canonicalizeOrigin(raw string) string { |
There was a problem hiding this comment.
Was net/url.Parse considered for this function, and if so, what made hand-rolled parsing the better choice?
Asking because the current implementation has a few RFC-6454 edge cases that url.Parse would handle for free: userinfo (https://user:pass@host) is preserved rather than rejected per §6, IPv6 bracket detection is hand-rolled, and IDN/punycode normalization isn't applied. Curious if these are deliberate tradeoffs (e.g., avoiding a dep on golang.org/x/net/idna, or the cost of full URL parsing on the hot path).
There was a problem hiding this comment.
Switched to net/url.Parse. It now rejects userinfo per RFC 6454 §6 (closing an Origin-smuggling vector like http://localhost:8080@evil.example), handles IPv6 brackets and host/port via Hostname()/Port(), and lowercases scheme+host per §4. Unparseable / scheme-less / host-less / userinfo inputs map to a sentinel that can never match (fail closed). Added tests for these cases. IDN/punycode normalization is intentionally not applied — matching stays on the literal serialized form; can add x/net/idna in a follow-up if we want host equivalence.
| // and host portions of the Origin value are lowercased (RFC 6454 §4: scheme | ||
| // and host are ASCII-case-insensitive). Configured allowlist entries are | ||
| // lowercased once at construction time. | ||
| func CreateOriginMiddleware(allowedOrigins []string) types.MiddlewareFunction { |
There was a problem hiding this comment.
What's the purpose of having both CreateOriginMiddleware (exported, single-line forwarder) and createOriginHandler (unexported, the actual implementation)? The factory path bypasses the public API and calls the unexported one directly, and tests also use the unexported one — so the only caller of CreateOriginMiddleware is cmd/thv/app/proxy.go.
Could one of them be removed, or is there a reason to keep both as separate entry points (e.g., a planned divergence in behavior)?
There was a problem hiding this comment.
Collapsed into a single exported entry point NewHandler, used by both the factory (CreateMiddleware) and cmd/thv/app/proxy.go. Removed the duplicate forwarder.
ToolHive's proxy layer had no Origin-header validation, and the legacy HTTP+SSE transport sent `Access-Control-Allow-Origin: *`, leaving both modes open to DNS-rebinding attacks from browser clients. MCP 2025-11-25 §"Security Warning" requires servers to validate Origin on all connections and respond 403 when the value is invalid. This change introduces a dedicated middleware at pkg/transport/middleware/origin/ that rejects requests whose Origin header is present and not in an operator-configured allowlist. It is wired centrally in runner.Run for both the factory-based chain (thv run / thv-proxyrunner) and inline in the `thv proxy` chain. Wiring is done in runner.Run rather than in the builder because that is the only point where the effective Host/Port/AllowedOrigins are fully resolved: the CLI builder (WithMiddlewareFromFlags) defers port resolution to validateConfig, so an earlier hook would see port 0 and silently disable loopback-default protection for `thv run`. The middleware is prepended so Origin validation runs first in the chain. Behavior: - New --allowed-origins flag on `thv run` and `thv proxy` accepts a repeatable exact-match list. When empty and the bind host is loopback, a default loopback-only allowlist is derived automatically (http://localhost:PORT + 127.0.0.1 + [::1]). When empty and the bind is non-loopback, the middleware is skipped and a warning is logged — the bind-opt-in hardening lands in a follow-up. - Matching parses each value with net/url and compares scheme://host [:port] with scheme and host lowercased per RFC 6454 §4. Values carrying userinfo (RFC 6454 §6) or that fail to parse never match, closing an Origin-smuggling vector. Requests with multiple Origin headers are rejected outright. - 403 responses carry a JSON-RPC error body (id: null, code -32600, message "Origin not allowed"). - `Access-Control-Allow-Origin: *` removed from the httpsse SSE handler; the wildcard would have neutered any Origin enforcement via preflight response inheritance. Operator CRDs (MCPServer/MCPRemoteProxy/VirtualMCPServer) do not yet expose an allowedOrigins field, and vMCP composes its own middleware chain that does not reference this package. Both are deferred to follow-ups; until then operator/vMCP non-loopback deployments log the WARN above rather than enforcing. Closes audit row 5 (Origin validation absent). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Juan Antonio Osorio <ozz@stacklok.com>
f2aa438 to
322b09f
Compare
|
Filed the two deferred follow-ups referenced in the threads above:
|
Summary
ToolHive's proxy layer had no
Originheader validation, and the legacy HTTP+SSE transport emittedAccess-Control-Allow-Origin: *, leaving both modes open to DNS-rebinding attacks from browser clients. MCP 2025-11-25 §"Security Warning" requires servers to validateOriginon all connections and respond with 403 when the value is invalid.pkg/transport/middleware/origin/with a factory (for the RunConfig-driven chain used bythv run/thv-proxyrunner/ vMCP) and a bare constructor (for the inline chain used bythv proxy)--allowed-originsflag on boththv runandthv proxy; when empty and the bind is loopback, a default loopback-only allowlist is derived automatically. Non-loopback binds without an explicit list log a warning (bind opt-in hardening follows in a separate PR)Originheaders are rejected outright; 403 responses carry a canonical JSON-RPC error bodyAccess-Control-Allow-Origin: *from the httpsse SSE handler — the wildcard would have neutered the enforcement via preflight response inheritanceCloses audit row 5 (Origin validation absent) from the MCP 2025-11-25 spec-compliance audit.
Type of change
Test plan
task test) —pkg/transport/middleware/origin/...,pkg/runner/...,pkg/transport/proxy/httpsse/...all greentask lint-fix) — 0 issuesthv runandthv proxywith--host 127.0.0.1auto-derive the loopback allowlist;--host 0.0.0.0without--allowed-originslogs the expected warning; explicit--allowed-originstakes precedence over the auto-derivation. Verified the 403 JSON-RPC error body shape with curl + craftedOriginheaders.Does this introduce a user-facing change?
Yes. Two related changes:
--allowed-originsonthv runandthv proxyaccepts a repeatable exact-match list for the HTTPOriginheader. Default behavior for loopback binds preserves existing browser-client flows (same-origin fromlocalhost/127.0.0.1/[::1]is auto-allowed). Non-loopback binds continue to work as before but now log a warning recommending explicit origins.*wildcard on the httpsse/sseendpoint is gone. Any browser client that relied on cross-origin requests to the legacy SSE transport will need to migrate to the streamable transport or be added to--allowed-origins.Implementation plan
Approved implementation plan
PR-1 from the MCP proxy spec-compliance Phase 2 plan (
~/.claude/plans/yes-let-s-plan-phase-abundant-dragonfly.md, personal / not in repo).Scope: implement and wire the Origin-header validation middleware, remove the httpsse
Access-Control-Allow-Origin: *reflector. Uses the bare-middleware + factory pattern modelled onpkg/transport/middleware/header_forward.go. Shared default-allowlist derivation (origin.ResolveAllowedOrigins) keeps the runner andthv proxycall sites from drifting.Deferred to follow-up PRs:
--allow-public-bindopt-in — this PR only adds the warn path through the middleware absenceMCPServerCRD does not yet exposeallowedOriginsfor the operator-reconciled proxyrunner pods; a follow-up PR will add the CRD field and serialize it into runconfigSpecial notes for reviewers
net.ParseIP(host).IsLoopback()so127.0.0.2and similar variants are handled; the earlier literal-switch approach missed them.origin.ResolveAllowedOriginsis exported specifically socmd/thv/app/proxy.go(the inline chain) can share logic withpkg/runner/middleware.go. Doc comment explains the contract.🤖 Generated with Claude Code