Memory LLM completion lane bypasses the host model catalog (cannot honor OpenRouter provider routing or other model config) #901

## Summary

`memory-lancedb-pro`'s LLM completion lane builds a minimal OpenAI chat-completions request in its own client (`src/llm-client.ts`, `createApiKeyClient`) and posts it directly to the configured `llm.baseURL`. This bypasses the host model catalog and runtime entirely. When the base URL is OpenRouter, there is no way to apply OpenRouter's `provider` routing object, or any other catalog-level model configuration, so OpenRouter free-routes the model across all providers.

## Where

`src/llm-client.ts`, `createApiKeyClient()`. The request carries only `model`, `messages`, and `temperature` (plus a conditional `chat_template_kwargs`). The same client backs every completion lane: smart extraction, reflection, dedup/merge, admission control, and the memory upgrader.
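To make the gap concrete, here is a minimal sketch of the request shape described above. It is reconstructed from this description, not copied from `src/llm-client.ts`; the names `MinimalCompletionRequest` and `buildMinimalRequest` are hypothetical.

```typescript
// Hypothetical reconstruction of the body the completion lane sends.
// Not the plugin's actual code; field names follow the description above.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface MinimalCompletionRequest {
  model: string;
  messages: ChatMessage[];
  temperature: number;
  // Only attached conditionally, per the description above.
  chat_template_kwargs?: Record<string, unknown>;
}

function buildMinimalRequest(
  model: string,
  messages: ChatMessage[],
  temperature: number,
  chatTemplateKwargs?: Record<string, unknown>,
): MinimalCompletionRequest {
  const body: MinimalCompletionRequest = { model, messages, temperature };
  if (chatTemplateKwargs) {
    body.chat_template_kwargs = chatTemplateKwargs;
  }
  return body;
}

const body = buildMinimalRequest(
  "openai/gpt-oss-120b",
  [{ role: "user", content: "Extract memories from this turn." }],
  0.1,
);

// There is no slot for `provider`, extra headers, or other catalog settings.
console.log(Object.keys(body)); // [ 'model', 'messages', 'temperature' ]
```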

## Impact

On OpenRouter, an open-weight model such as `openai/gpt-oss-120b` is served by many providers with very different latency, throughput, quality, and rate limits. Because the plugin's calls carry no provider preference, they free-route (observed across SiliconFlow, DeepInfra, Together, Amazon Bedrock, NovitaAI, and Weights & Biases). A latency-sensitive memory and reflection lane cannot be pinned to fast providers (for example Cerebras or Groq), cannot avoid providers that return 429s, and cannot be attributed to the host app in OpenRouter's dashboard. Provider routing is only the first example: any catalog-level model setting (sampling params, request headers, request policy) is equally unavailable to this lane.
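For illustration, this is roughly what the lane would need to send to pin providers and attribute traffic. The sketch assumes OpenRouter's documented `provider` object (`order`, `allow_fallbacks`) and its optional `HTTP-Referer` / `X-Title` attribution headers. The provider slugs, the URL, and the `withProviderRouting` helper are illustrative assumptions, so check the routing reference for exact values.

```typescript
// Subset of OpenRouter's `provider` routing object (assumed from its docs).
interface ProviderRouting {
  order?: string[];          // providers to try, in priority order
  allow_fallbacks?: boolean; // false = never route outside `order`
}

interface BaseRequest {
  model: string;
  messages: { role: string; content: string }[];
  temperature: number;
}

// Hypothetical helper: attach routing without mutating the original body.
function withProviderRouting(
  body: BaseRequest,
  provider: ProviderRouting,
): BaseRequest & { provider: ProviderRouting } {
  return { ...body, provider };
}

const pinned = withProviderRouting(
  {
    model: "openai/gpt-oss-120b",
    messages: [{ role: "user", content: "Reflect on this session." }],
    temperature: 0.1,
  },
  // Slugs are illustrative; confirm them against OpenRouter's provider list.
  { order: ["cerebras", "groq"], allow_fallbacks: false },
);

// Attribution headers; the values here are placeholders.
const attributionHeaders: Record<string, string> = {
  "HTTP-Referer": "https://example.invalid/host-app",
  "X-Title": "Host App",
};

console.log(JSON.stringify(pinned.provider));
console.log(Object.keys(attributionHeaders));
```

None of these fields can currently be supplied to the completion lane.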

## Proposed direction

Rather than adding a one-off passthrough per setting, let the completion lane route through the host-managed LLM runtime using a catalog model reference, the way other OpenClaw plugins already do (for example [lossless-claw](https://github.com/Martian-Engineering/lossless-claw)'s `summaryModel` and `expansionModel`, which set `model: <ref>` and let the host resolve and complete it). The plugin already holds a host-runtime handle (`api.runtime.agent`) for its recall sub-agent, so the completion lane could use the same managed path when running under OpenClaw with a catalog model reference, and fall back to the existing standalone client for CLI and non-host usage.

Benefit: the full catalog entry applies automatically (provider routing, sampling params, headers, auth profiles, app attribution, and anything added later), instead of re-patching the client for each new field.
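A rough sketch of the lane selection this proposal implies. Everything here is hypothetical: the `HostRuntime` interface stands in for whatever `api.runtime.agent` actually exposes, and `catalogModelRef`, `CompletionLane`, and `selectCompletionLane` are made-up names used only to show the shape of the fallback.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Anything that can turn messages into a completion string.
interface CompletionLane {
  complete(messages: ChatMessage[]): Promise<string>;
}

// Hypothetical stand-in for the host-managed runtime; the real host API may differ.
interface HostRuntime {
  complete(req: { model: string; messages: ChatMessage[] }): Promise<string>;
}

interface LaneConfig {
  catalogModelRef?: string;   // hypothetical config key holding a catalog model reference
  standalone: CompletionLane; // the existing direct-to-baseURL client
}

// Prefer the managed path when a host and a catalog reference are both present;
// otherwise keep today's standalone behaviour (CLI and non-host usage).
function selectCompletionLane(
  host: HostRuntime | undefined,
  cfg: LaneConfig,
): CompletionLane {
  const ref = cfg.catalogModelRef;
  if (host && ref) {
    return {
      complete: (messages) => host.complete({ model: ref, messages }),
    };
  }
  return cfg.standalone;
}

// Usage with stub implementations.
const standaloneStub: CompletionLane = {
  complete: async () => "standalone",
};
const hostStub: HostRuntime = {
  complete: async (req) => `managed:${req.model}`,
};

const lane = selectCompletionLane(hostStub, {
  catalogModelRef: "memory-fast",
  standalone: standaloneStub,
});
lane.complete([{ role: "user", content: "ping" }]).then((out) => console.log(out));
```

With this shape, catalog-level settings stay the host's concern, and the plugin only decides which path to take.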

OpenRouter provider routing reference: https://openrouter.ai/docs/features/provider-routing

## Environment

- memory-lancedb-pro 1.1.0-beta.11 (master)
- OpenClaw 2026.6.1, host-native, Node 24
- Provider: OpenRouter (chat-completions)

