Bound binary response size in generated clients #86

@lorr1

Follow-up to PR #84 review.

Context

client.request_bytes (in generated servers) currently reads the full
response body via response.content, which buffers the whole body in
memory with no upper bound. For the "help people build MCP servers
fast" default that is the right choice, but a truly large upstream body
(multi-GB PDF, tarball, archive) can OOM the generated server.

An earlier iteration of PR #84 streamed the body via
httpx.AsyncClient.stream(...) + aiter_bytes() with a 32 MiB cap
configurable via MCP_MAX_BINARY_RESPONSE_BYTES. That shape was
reverted as too complex for a scaffolding repo, but the underlying
concern remains.
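
For reference, a minimal sketch of what a capped streaming read could look like. This is not the reverted code from PR #84. The names read_capped, request_bytes_capped, ResponseTooLargeError and DEFAULT_MAX_BYTES are hypothetical, and the client argument is assumed to be an httpx.AsyncClient. Only the 32 MiB figure comes from the earlier iteration.

```python
from typing import Any, AsyncIterator

# Hypothetical default; 32 MiB mirrors the cap used in the reverted iteration.
DEFAULT_MAX_BYTES = 32 * 1024 * 1024


class ResponseTooLargeError(Exception):
    """Raised when a body exceeds the cap (hypothetical name)."""


async def read_capped(chunks: AsyncIterator[bytes], max_bytes: int) -> bytes:
    """Accumulate chunks, failing as soon as the running total passes max_bytes."""
    buf = bytearray()
    async for chunk in chunks:
        if len(buf) + len(chunk) > max_bytes:
            raise ResponseTooLargeError(
                f"response body exceeds {max_bytes} bytes"
            )
        buf.extend(chunk)
    return bytes(buf)


async def request_bytes_capped(
    client: Any,  # assumed to be an httpx.AsyncClient
    method: str,
    url: str,
    max_bytes: int = DEFAULT_MAX_BYTES,
) -> bytes:
    """Stream the response and stop reading once the cap is exceeded."""
    async with client.stream(method, url) as response:
        response.raise_for_status()
        return await read_capped(response.aiter_bytes(), max_bytes)
```

Keeping the cap logic in read_capped, separate from the HTTP call, means it can be unit-tested with a plain async generator and no network access.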

Open questions

  • Where should the cap live — scope YAML (per-tool), server-level
    config, or a hard default baked into the renderer?
  • Do we want in-memory reads with an early fail, or stream-to-tempfile
    so the caller can hand MCP a path instead of bytes?
  • Avoid the env-var override pattern from the reverted version — if
    configurable, prefer something declared in mcp-scope.yaml.
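
To make the second question concrete, here is a rough sketch of the stream-to-tempfile variant, plus an early fail on a declared Content-Length. All names (declared_length, spool_to_tempfile, ResponseTooLargeError) are hypothetical. The chunk iterator is assumed to come from something like httpx's response.aiter_bytes(). How max_bytes would be sourced (scope YAML, server config, or a renderer default) is deliberately left open.

```python
import os
import tempfile
from pathlib import Path
from typing import AsyncIterator, Mapping, Optional


class ResponseTooLargeError(Exception):
    """Raised when a body exceeds the cap (hypothetical name)."""


def declared_length(headers: Mapping[str, str]) -> Optional[int]:
    """Return Content-Length as an int, or None if absent or malformed.

    Assumes lowercase keys when given a plain dict; httpx headers are
    case-insensitive, so either spelling works there.
    """
    raw = headers.get("content-length")
    if raw is None:
        return None
    try:
        return int(raw)
    except ValueError:
        return None


async def spool_to_tempfile(
    chunks: AsyncIterator[bytes], max_bytes: int, suffix: str = ""
) -> Path:
    """Write chunks to a temp file and return its path; delete it on failure."""
    total = 0
    tmp = tempfile.NamedTemporaryFile(delete=False, suffix=suffix)
    try:
        with tmp:
            async for chunk in chunks:
                total += len(chunk)
                if total > max_bytes:
                    raise ResponseTooLargeError(
                        f"response body exceeds {max_bytes} bytes"
                    )
                # Blocking write; a real server might offload this to a thread.
                tmp.write(chunk)
    except BaseException:
        # The file is already closed here, so removing it is safe.
        os.unlink(tmp.name)
        raise
    return Path(tmp.name)
```

A caller could check declared_length(response.headers) against the cap before reading anything, then fall back to the running-total check for responses that omit or misstate Content-Length. With the tempfile variant the caller owns the returned path and is responsible for deleting it.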

Non-goals

  • Re-introducing MCP_MAX_BINARY_RESPONSE_BYTES as-is.
  • Any change to the response_kind contract.
