
feat(gateway): add Anthropic Messages API support to openrouter proxy #1122

Open
kilo-code-bot[bot] wants to merge 14 commits into main from feat/anthropic-messages-api

Conversation

@kilo-code-bot (Contributor) commented Mar 16, 2026

Summary

Adds support for the Anthropic Messages API to the OpenRouter proxy route (/api/openrouter/messages and /api/gateway/messages), in addition to the existing OpenAI chat completions (/chat/completions) and Responses API (/responses) support.

Key changes:

  • types.ts: Added GatewayMessagesRequest type (Anthropic Messages format with model, max_tokens, messages, system, stream, tools, etc.) and extended the GatewayRequest discriminated union with the messages kind.
  • route.ts: Extended validatePath to accept /messages, added body parsing for the messages format, applied the same admin-only guard as the Responses API, handled prompt info extraction and free-model rewriting for the new kind.
  • processUsage.messages.ts (new file): Streaming and non-streaming usage parsing for Anthropic's SSE format (message_start → input tokens, message_delta → output tokens + stop reason), with OpenRouter cost field handling mirroring the existing chat completions and responses parsers.
  • processUsage.ts: Wired the new messages api_kind into countAndStoreUsage.
  • request-helpers.ts: getMaxTokens now handles the messages kind (returns max_tokens).
  • api-metrics.server.ts: getToolsAvailable and getToolsUsed handle Anthropic tool format (tools have a top-level name, tool use appears as tool_use content blocks in assistant messages).
  • abuse-service.ts: extractFullPrompts handles the messages kind (top-level system field + user message content extraction).
  • providers/index.ts: openRouterRequest body parameter type updated to include GatewayMessagesRequest.
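The message_start / message_delta accounting described above can be sketched as follows. This is a minimal sketch against Anthropic's published SSE event shapes; the function name, parameter type, and result type are illustrative, not the PR's actual code:

```ts
// Anthropic Messages SSE usage accounting, in miniature:
// message_start carries usage.input_tokens; message_delta carries
// usage.output_tokens plus the stop reason.
interface MessagesUsageResult {
  inputTokens: number;
  outputTokens: number;
  stopReason: string | null;
}

function parseMessagesSseUsage(
  events: { type: string; [k: string]: unknown }[]
): MessagesUsageResult {
  const result: MessagesUsageResult = { inputTokens: 0, outputTokens: 0, stopReason: null };
  for (const event of events) {
    if (event.type === 'message_start') {
      const msg = event.message as { usage?: { input_tokens?: number } } | undefined;
      result.inputTokens = msg?.usage?.input_tokens ?? 0;
    } else if (event.type === 'message_delta') {
      const usage = event.usage as { output_tokens?: number } | undefined;
      result.outputTokens = usage?.output_tokens ?? result.outputTokens;
      const delta = event.delta as { stop_reason?: string } | undefined;
      result.stopReason = delta?.stop_reason ?? result.stopReason;
    }
  }
  return result;
}
```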

The Messages API endpoint is gated behind is_admin (same as the Responses API) while it's experimental.

Verification

  • Manual code review for correctness against the Anthropic Messages API streaming spec and OpenRouter's cost field conventions.
  • Verified the SSE event shape matches Anthropic's streaming format: message_start carries usage.input_tokens, message_delta carries usage.output_tokens.
  • TypeScript typecheck could not be run (tsgo binary unavailable in this environment); please verify during CI.

Visual Changes

N/A

Reviewer Notes

  • The Messages API path (/messages) is forwarded as-is to OpenRouter at ${provider.apiUrl}/messages — OpenRouter supports the native Anthropic Messages API format at this path.
  • Free model response rewriting is a no-op for the messages kind (free models don't use the Anthropic Messages API today), so it falls through to wrapInSafeNextResponse.
  • Custom LLM providers return a 404 for the messages kind (same as responses), since customLlmRequest only handles chat completions.
  • Cache breakpoints and tool call ID normalization in applyAnthropicModelSettings are already guarded behind kind === 'chat_completions'; clients using the native Messages API are responsible for their own cache control markup.
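The routing behavior in the notes above (native pass-through for /messages, 404 for custom LLM providers outside chat completions) could be sketched like this; the function and its signature are hypothetical, not code from the PR:

```ts
type GatewayKind = 'chat_completions' | 'responses' | 'messages';

// Sketch: the /messages kind forwards unchanged to the provider's native
// Anthropic-format endpoint; custom LLM providers only implement chat
// completions, so every other kind maps to null (which the caller turns
// into a 404).
function resolveUpstreamUrl(
  kind: GatewayKind,
  apiUrl: string,
  isCustomProvider: boolean
): string | null {
  if (isCustomProvider && kind !== 'chat_completions') return null;
  switch (kind) {
    case 'chat_completions':
      return `${apiUrl}/chat/completions`;
    case 'responses':
      return `${apiUrl}/responses`;
    case 'messages':
      return `${apiUrl}/messages`;
  }
}
```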

```ts
const usage = responseJson?.usage as MessagesApiUsage | undefined;

const responseContent =
  responseJson?.content
```

WARNING: Non-streaming error responses throw during usage parsing

When OpenRouter returns an error JSON payload for /messages, content is usually missing. The unguarded .filter().map().join() chain throws before hasError can be recorded, so failed non-streaming requests skip usage accounting and can bubble an unexpected exception out of countAndStoreUsage.
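One defensive shape for this extraction would guard the chain on an array check, so error payloads fall through to an empty string instead of throwing. A sketch only; the function name and block type are assumptions, not the PR's code:

```ts
interface ContentBlock {
  type: string;
  text?: string;
}

// Error payloads from /messages typically have no `content` array, so
// guard before chaining filter/map/join.
function extractResponseText(
  responseJson: { content?: unknown } | null | undefined
): string {
  const content = responseJson?.content;
  if (!Array.isArray(content)) return ''; // error payloads fall through safely
  return (content as ContentBlock[])
    .filter(b => b.type === 'text' && typeof b.text === 'string')
    .map(b => b.text)
    .join('');
}
```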

kilo-code-bot bot commented Mar 16, 2026

Code Review Summary

Status: 8 Issues Found | Recommendation: Address before merge

Overview

| Severity   | Count |
|------------|-------|
| CRITICAL   | 0     |
| WARNING    | 8     |
| SUGGESTION | 0     |


Issue Details

WARNING

| File | Line | Issue |
|------|------|-------|
| src/app/api/openrouter/[...path]/route.ts | 123 | Invalid /messages payloads can crash before a 400 is returned |
| src/app/api/openrouter/[...path]/route.ts | 404 | Non-standard Messages fields are sent to Anthropic-compatible backends |
| src/app/api/openrouter/[...path]/route.ts | 445 | /messages can still hit gateways that only support chat-completions |
| src/lib/abuse-service.ts | 42 | Tool-result follow-up turns drop the prompt seen by abuse classification |
| src/lib/processUsage.messages.ts | 37 | Vercel-routed Messages requests are recorded with zero cost |
| src/lib/processUsage.messages.ts | 127 | Streamed Messages errors are recorded as successful usage |
| src/lib/processUsage.messages.ts | 215 | Non-streaming error responses throw during usage parsing |
| src/lib/processUsage.messages.ts | 247 | Follow-up tool calls lose the logged user prompt |
Other Observations (not in diff)

None.

Files Reviewed (13 files)
  • package.json - 0 issues
  • src/app/api/openrouter/[...path]/route.ts - 3 issues
  • src/lib/abuse-service.ts - 1 issue
  • src/lib/kilo-auto-model.ts - 0 issues
  • src/lib/o11y/api-metrics.server.ts - 0 issues
  • src/lib/processUsage.messages.ts - 4 issues
  • src/lib/processUsage.ts - 0 issues
  • src/lib/providers/anthropic.ts - 0 issues
  • src/lib/providers/index.ts - 0 issues
  • src/lib/providers/openrouter/request-helpers.ts - 0 issues
  • src/lib/providers/openrouter/types.ts - 0 issues
  • src/lib/providers/vercel/index.ts - 0 issues
  • src/lib/providers/xai.ts - 0 issues

```ts
  : Array.isArray(systemContent)
    ? systemContent.map(b => b.text).join('\n')
    : null;
const lastUserMessage = request.body.messages.filter(m => m.role === 'user').at(-1);
```

WARNING: Tool-result turns erase the user prompt for abuse checks

Anthropic tool loops send tool results as a user message. When that is the latest user turn, this picks it, filters out the non-text blocks, and returns null even though an earlier user text prompt is still in history. That leaves the abuse classifier blind on follow-up tool calls.
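A fix along the lines this comment suggests would walk backwards through history until it finds a user turn with actual text, skipping tool_result-only turns. The following is a sketch under assumed message shapes, not the repository's implementation:

```ts
interface Block {
  type: string;
  text?: string;
}
interface Msg {
  role: string;
  content: string | Block[];
}

// Walk backwards to the most recent user turn that carries natural-language
// text, skipping user messages that contain only tool_result blocks.
function lastUserTextPrompt(messages: Msg[]): string | null {
  for (let i = messages.length - 1; i >= 0; i--) {
    const m = messages[i];
    if (m.role !== 'user') continue;
    if (typeof m.content === 'string') return m.content;
    const text = m.content
      .filter(b => b.type === 'text' && b.text)
      .map(b => b.text)
      .join('\n');
    if (text) return text; // tool_result-only turns yield '' and are skipped
  }
  return null;
}
```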

```ts
const cacheWriteTokens = usage?.cache_creation_input_tokens ?? 0;

// OpenRouter path: cost fields are present directly in usage
if (usage?.cost != null || usage?.is_byok != null) {
```

WARNING: Vercel-routed Messages requests are recorded with zero cost

This only understands OpenRouter's extra usage.cost / is_byok fields. The PR also allows GatewayMessagesRequest to route through Vercel for BYOK, and Anthropic-compatible responses there only carry token counts, so these requests fall through to the zero-cost path and market_cost is underreported.
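One hypothetical mitigation is a token-based estimate when no upstream cost field is present. Everything here (names, the pricing type, the fallback policy) is illustrative, not the PR's code; a real fix would need the gateway's actual per-model pricing data:

```ts
interface TokenUsage {
  input_tokens?: number;
  output_tokens?: number;
  cost?: number; // OpenRouter reports this directly; Vercel-routed responses do not
}
interface ModelPricing {
  inputPerMTok: number;  // USD per million input tokens
  outputPerMTok: number; // USD per million output tokens
}

// Prefer the upstream cost field; otherwise estimate from token counts.
// Returning 0 with no pricing is the current (underreporting) behavior.
function estimateCostUsd(usage: TokenUsage, pricing?: ModelPricing): number {
  if (usage.cost != null) return usage.cost;
  if (!pricing) return 0;
  const input = ((usage.input_tokens ?? 0) / 1_000_000) * pricing.inputPerMTok;
  const output = ((usage.output_tokens ?? 0) / 1_000_000) * pricing.outputPerMTok;
  return input + output;
}
```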

```ts
  ? body.system.map(b => b.text).join('\n')
  : '';

const lastUserMessage = body.messages.filter(m => m.role === 'user').at(-1);
```

WARNING: Follow-up tool calls lose the logged user prompt

The Messages API encodes tool results as a user message with tool_result blocks. When that message is last, this extractor emits an empty prefix instead of the previous natural-language user turn, so prompt logging and downstream analytics lose the actual request text.

```ts
const resolved = resolveAutoModel(model, modeHeader);
request.body.model = resolved.model;
if (resolved.reasoning) request.body.reasoning = resolved.reasoning;
if (resolved.reasoning && request.kind === 'chat_completions') {
```

WARNING: Auto-model reasoning is dropped on /messages

resolveAutoModel() still returns reasoning for modes like plan, general, and debug, but this guard only copies it into chat-completions requests. GatewayMessagesRequest already has a thinking field, so /messages requests using kilo-auto/frontier or kilo-auto/balanced will silently lose the extra reasoning configuration and behave differently from the other gateway APIs.
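A possible shape for the fix is a kind-aware mapping: chat completions keep the reasoning object, while the messages kind translates a token budget into Anthropic's thinking field. This is a sketch with assumed field names (the Reasoning shape is invented for illustration):

```ts
interface Reasoning {
  effort?: string;
  max_tokens?: number;
}

// Propagate resolved auto-model reasoning instead of dropping it for the
// messages kind. Anthropic's Messages API expresses a reasoning budget as
// thinking: { type: 'enabled', budget_tokens: N }.
function applyResolvedReasoning(
  kind: string,
  body: Record<string, unknown>,
  reasoning?: Reasoning
): void {
  if (!reasoning) return;
  if (kind === 'chat_completions') {
    body.reasoning = reasoning;
  } else if (kind === 'messages' && reasoning.max_tokens) {
    body.thinking = { type: 'enabled', budget_tokens: reasoning.max_tokens };
  }
}
```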

```ts
const userId = generateProviderSpecificHash(user.id, provider);
if (requestBodyParsed.kind === 'messages') {
  requestBodyParsed.body.metadata = { user_id: userId };
  requestBodyParsed.body.user = userId;
```

WARNING: Non-standard Messages fields are sent to Anthropic-compatible backends

GatewayMessagesRequest is widened with user/session_id, but Anthropic's Messages request schema does not define either field. getProvider() can still route /messages through Vercel/BYOK, so this branch will forward those keys to Anthropic-compatible /messages endpoints, where they can be rejected as unknown parameters even when the model otherwise supports the Messages API.
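One way to address this is to attach the hashed id only in schema-defined places and strip the extra keys before forwarding to non-OpenRouter backends. The function below is a hypothetical sketch (the target discriminator and helper name are invented):

```ts
// Attach the hashed user id via Anthropic's metadata.user_id, and drop
// fields the Anthropic Messages schema does not define when the request
// is routed to an Anthropic-compatible backend rather than OpenRouter.
function prepareMessagesBody(
  body: Record<string, unknown>,
  userId: string,
  target: 'openrouter' | 'anthropic-compatible'
): Record<string, unknown> {
  const out: Record<string, unknown> = { ...body, metadata: { user_id: userId } };
  if (target === 'openrouter') {
    out.user = userId; // OpenRouter accepts the extra field
  } else {
    delete out.user;       // unknown parameter to Anthropic's schema
    delete out.session_id; // likewise
  }
  return out;
}
```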

```ts
let messageId: string | null = null;
let model: string | null = null;
let responseContent = '';
const reportedError = statusCode >= 400;
```

WARNING: Streaming error events are recorded as successful usage

reportedError is fixed from the initial HTTP status and never flips once SSE processing starts. Anthropic-style /messages streams can surface failures as in-band type: "error" events after the response has already started, so this path still returns hasError: false and logs the request as a successful completion.

```ts
  return;
}

//if (json.type === 'error') {
```

WARNING: Streamed Messages errors are recorded as successful usage

Anthropic can send { type: 'error' } inside a 200 SSE response. With this branch commented out, reportedError never flips to true, so failed /messages streams can still be logged with hasError: false and look like successful requests in usage tracking.
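The fix both warnings describe amounts to letting in-band events flip the error flag after the stream has started. A minimal sketch (function name and event shape assumed, not the PR's code):

```ts
// A 200-status Anthropic-style stream can still fail mid-flight via an
// in-band { type: 'error' } SSE event, so the error flag must be mutable
// during stream processing rather than fixed from the initial HTTP status.
function scanStreamForError(
  statusCode: number,
  events: { type: string }[]
): boolean {
  let reportedError = statusCode >= 400;
  for (const ev of events) {
    if (ev.type === 'error') reportedError = true;
  }
  return reportedError;
}
```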
