Memory LLM completion lane bypasses the host model catalog (cannot honor OpenRouter provider routing or other model config) #901

Description

@gorkem2020

Summary

memory-lancedb-pro's LLM completion lane builds a minimal OpenAI chat-completions request in its own client (src/llm-client.ts, createApiKeyClient) and posts it directly to the configured llm.baseURL. This bypasses the host model catalog and runtime entirely. When the base URL is OpenRouter, there is no way to apply OpenRouter's provider routing object, or any other catalog-level model configuration, so OpenRouter free-routes the model across all providers.

Where

src/llm-client.ts, createApiKeyClient(). The request carries only model, messages, temperature (plus a conditional chat_template_kwargs). The same client backs every completion lane (smart extraction, reflection, dedup/merge, admission-control, the memory upgrader).
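To make the gap concrete, here is a minimal sketch of the request shape described above. It is a reconstruction from this description, not a copy of the code in src/llm-client.ts; the names buildMinimalRequest, ChatMessage, and MinimalChatRequest are hypothetical.

```typescript
// Hypothetical reconstruction of the request shape described in this issue.
// Not copied from src/llm-client.ts; all names here are illustrative.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface MinimalChatRequest {
  model: string;
  messages: ChatMessage[];
  temperature: number;
  // Only present when the caller supplies it.
  chat_template_kwargs?: Record<string, unknown>;
}

function buildMinimalRequest(
  model: string,
  messages: ChatMessage[],
  temperature: number,
  chatTemplateKwargs?: Record<string, unknown>,
): MinimalChatRequest {
  const body: MinimalChatRequest = { model, messages, temperature };
  if (chatTemplateKwargs !== undefined) {
    body.chat_template_kwargs = chatTemplateKwargs;
  }
  // Note what is absent: no `provider` object and no catalog-derived fields.
  return body;
}
```

Nothing in this shape has a slot for routing preferences, which is why every lane that shares the client inherits the same limitation.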

Impact

On OpenRouter, an open-weight model such as openai/gpt-oss-120b is served by many providers with very different latency, throughput, quality, and rate limits. Because the plugin's calls carry no provider preference, they free-route (observed across SiliconFlow, DeepInfra, Together, Amazon Bedrock, NovitaAI, and Weights & Biases). A latency-sensitive memory and reflection lane cannot be pinned to fast providers (for example Cerebras or Groq), cannot avoid providers that return 429s, and cannot be attributed to the host app in OpenRouter's dashboard. Provider routing is only the first example: any catalog-level model setting (sampling params, request headers, request policy) is equally unavailable to this lane.
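For reference, this is roughly what a pinned request would need to carry. The sketch assumes the `provider` object fields (`order`, `allow_fallbacks`, `ignore`, `sort`) and the `HTTP-Referer` / `X-Title` attribution headers as documented by OpenRouter; the helper names, the provider slugs, and the app URL and title are illustrative assumptions, so check the exact slugs against the OpenRouter docs.

```typescript
// Subset of OpenRouter's provider routing object (see the routing docs).
interface ProviderRouting {
  order?: string[];
  allow_fallbacks?: boolean;
  ignore?: string[];
  sort?: "price" | "throughput" | "latency";
}

interface RoutedChatRequest {
  model: string;
  messages: { role: string; content: string }[];
  temperature: number;
  provider?: ProviderRouting;
}

// Hypothetical helper: attach routing preferences only when they are configured.
function withProviderRouting(
  body: RoutedChatRequest,
  provider?: ProviderRouting,
): RoutedChatRequest {
  return provider ? { ...body, provider } : body;
}

// Hypothetical helper: OpenRouter attributes traffic to an app via these headers.
function attributionHeaders(appUrl?: string, appTitle?: string): Record<string, string> {
  const headers: Record<string, string> = {};
  if (appUrl) headers["HTTP-Referer"] = appUrl;
  if (appTitle) headers["X-Title"] = appTitle;
  return headers;
}

// Example: pin to two fast providers and disable fallback to others.
// The slugs, URL, and title below are assumptions for illustration.
const pinned = withProviderRouting(
  {
    model: "openai/gpt-oss-120b",
    messages: [{ role: "user", content: "Summarize this memory." }],
    temperature: 0.1,
  },
  { order: ["cerebras", "groq"], allow_fallbacks: false },
);
const headers = attributionHeaders("https://example.invalid/host-app", "Host App");
```

This is shown only to illustrate the missing fields; the proposal is not to add this passthrough to the plugin's client.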

Proposed direction

Rather than adding a one-off passthrough per setting, let the completion lane route through the host-managed LLM runtime using a catalog model reference, the way other OpenClaw plugins already do (for example, lossless-claw's summaryModel and expansionModel set model: <ref> and let the host resolve and complete it). The plugin already holds a host-runtime handle (api.runtime.agent) for its recall sub-agent, so the completion lane could use the same managed path when running under OpenClaw with a catalog model reference, and fall back to the existing standalone client for CLI and non-host usage.

Benefit: the full catalog entry applies automatically (provider routing, sampling params, headers, auth profiles, app attribution, and anything added later), instead of re-patching the client for each new field.
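A rough sketch of the selection logic this implies is below. Everything in it is hypothetical: LaneCompleter stands in for whatever completion surface the host runtime exposes (I have not verified the shape of api.runtime.agent for this purpose), and modelRef is an assumed config field for the catalog reference.

```typescript
interface LaneMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical common surface for both the host-managed path and the standalone client.
interface LaneCompleter {
  complete(model: string, messages: LaneMessage[]): Promise<string>;
}

interface LaneConfig {
  modelRef?: string; // assumed: catalog model reference, resolved by the host
  model: string; // raw model id used by the existing standalone client
}

interface LaneRoute {
  kind: "managed" | "standalone";
  completer: LaneCompleter;
  model: string;
}

// Prefer the managed path only when a host runtime AND a catalog reference exist;
// otherwise keep today's behaviour (CLI and non-host usage).
function selectLaneRoute(
  config: LaneConfig,
  standalone: LaneCompleter,
  host?: LaneCompleter,
): LaneRoute {
  if (host !== undefined && config.modelRef !== undefined) {
    return { kind: "managed", completer: host, model: config.modelRef };
  }
  return { kind: "standalone", completer: standalone, model: config.model };
}

async function completeViaLane(route: LaneRoute, messages: LaneMessage[]): Promise<string> {
  return route.completer.complete(route.model, messages);
}
```

Because the decision sits in one place, every lane that shares the client (extraction, reflection, dedup/merge, admission control, the upgrader) would pick up the managed path without per-lane changes.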

OpenRouter provider routing reference: https://openrouter.ai/docs/features/provider-routing

Environment

  • memory-lancedb-pro 1.1.0-beta.11 (master)
  • OpenClaw 2026.6.1, host-native, Node 24
  • Provider: OpenRouter (chat-completions)

Labels

enhancement: New feature or request
