Bound binary response size in generated clients #86

Follow-up to PR #84 review.

## Context

`client.request_bytes` (in generated servers) currently reads the full
response body via `response.content` with no upper bound. That is the
right default for the "help people build MCP servers fast" goal, but a
very large upstream body (multi-GB PDF, tarball, archive) can exhaust
memory and crash the generated server.

An earlier iteration of PR #84 streamed the body via
`httpx.AsyncClient.stream(...) + aiter_bytes()` with a 32 MiB cap
configurable via `MCP_MAX_BINARY_RESPONSE_BYTES`. That shape was
reverted as too complex for a scaffolding repo, but the underlying
concern remains.
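
For reference, a minimal sketch of the capped streaming read described
above. This is not the reverted code from PR #84; `ResponseTooLarge`,
`read_capped`, and `fetch_bytes_capped` are hypothetical names, and the
32 MiB constant only mirrors the cap mentioned for that iteration.

```python
from typing import AsyncIterator

# 32 MiB, matching the cap mentioned for the reverted iteration.
DEFAULT_MAX_BYTES = 32 * 1024 * 1024


class ResponseTooLarge(Exception):
    """Hypothetical error raised when a body exceeds the cap."""


async def read_capped(
    chunks: AsyncIterator[bytes], max_bytes: int = DEFAULT_MAX_BYTES
) -> bytes:
    """Accumulate chunks in memory, failing before the cap is exceeded."""
    buf = bytearray()
    async for chunk in chunks:
        if len(buf) + len(chunk) > max_bytes:
            raise ResponseTooLarge(f"response body exceeds {max_bytes} bytes")
        buf.extend(chunk)
    return bytes(buf)


async def fetch_bytes_capped(url: str, max_bytes: int = DEFAULT_MAX_BYTES) -> bytes:
    """Stream a GET response through read_capped (assumes httpx is installed)."""
    import httpx  # imported lazily so read_capped stays dependency-free

    async with httpx.AsyncClient() as client:
        async with client.stream("GET", url) as response:
            response.raise_for_status()
            return await read_capped(response.aiter_bytes(), max_bytes)
```

Keeping the cap logic in `read_capped` separate from the HTTP call means
it can be tested without a network, which may help with the "too complex
for a scaffolding repo" concern.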

## Open questions

- Where should the cap live — scope YAML (per-tool), server-level
  config, or a hard default baked into the renderer?
- Do we want in-memory reads with an early fail, or stream-to-tempfile
  so the caller can hand MCP a path instead of bytes?
- If the cap is configurable, prefer declaring it in `mcp-scope.yaml`
  over the env-var override pattern from the reverted version.
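
To make the second question concrete, here is a rough sketch of the
stream-to-tempfile option. It assumes the caller already has an async
iterator of chunks (for example from httpx's `aiter_bytes()`);
`spool_to_tempfile` and `ResponseTooLarge` are hypothetical names, and
where `max_bytes` comes from is left open, as in the first question.

```python
import tempfile
from pathlib import Path
from typing import AsyncIterator


class ResponseTooLarge(Exception):
    """Hypothetical error raised when a body exceeds the cap."""


async def spool_to_tempfile(
    chunks: AsyncIterator[bytes], max_bytes: int, suffix: str = ".bin"
) -> Path:
    """Write chunks to a temp file and return its path.

    The partial file is removed if the cap is exceeded or the read fails.
    File writes here are blocking; that is a simplification for the sketch.
    """
    written = 0
    tmp = tempfile.NamedTemporaryFile(delete=False, suffix=suffix)
    path = Path(tmp.name)
    try:
        with tmp:
            async for chunk in chunks:
                written += len(chunk)
                if written > max_bytes:
                    raise ResponseTooLarge(
                        f"response body exceeds {max_bytes} bytes"
                    )
                tmp.write(chunk)
    except BaseException:
        # Do not leave a truncated file behind for the caller to trip over.
        path.unlink(missing_ok=True)
        raise
    return path
```

With this shape memory use stays bounded by the chunk size, but the
caller becomes responsible for deleting the returned file, which is a
cost the in-memory early-fail option does not have.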

## Non-goals

- Re-introducing `MCP_MAX_BINARY_RESPONSE_BYTES` as-is.
- Any change to the `response_kind` contract.
