Skip to content

feat(tokenless): add MCP tokenless_retrieve server#1285

Open
shiloong wants to merge 1 commit into
alibaba:mainfrom
shiloong:feature/tokenless/ccr-mcp-retrieve
Open

feat(tokenless): add MCP tokenless_retrieve server#1285
shiloong wants to merge 1 commit into
alibaba:mainfrom
shiloong:feature/tokenless/ccr-mcp-retrieve

Conversation

@shiloong

@shiloong shiloong commented Jul 3, 2026

Copy link
Copy Markdown
Collaborator

Description

Expose the stash retrieval path over MCP so an MCP-connected agent can recover truncated payloads on demand — the MCP analogue of the tokenless retrieve CLI. Closes the stash feature's MCP gap vs Headroom CCR's headroom_retrieve (the only retrieval path before this PR was the CLI).

This is the next priority item from the stash-feature completion list (part 1 = crate #1255 merged; part 2 = #1270 compressor integration + CLI; part 3 = #1271 tests + docs; this PR = MCP retrieve).

tokenless mcp serve — minimal stdio MCP server:

  • Hand-rolled JSON-RPC (no MCP SDK dependency — keeps the zero-runtime-dependency core path that tokenless is built on). Implements initialize / tools/list / tools/call per MCP 2024-11-05; notifications get no response; unknown methods return -32601.
  • Exposes one tool: tokenless_retrieve (hash → payload text). Accepts a bare 24-hex hash or a line containing a <<tokenless:KEY>> marker (extracted via tokenless_ccr::extract_hash), mirroring the CLI.
  • Reuses open_stash_store + the passwd-rooted stash db trust model from the Retrieve CLI — no duplication of path validation.
  • Fail model: per-request tool errors return MCP tool results (isError: true) so the client keeps talking; only stdin/stdout I/O failure terminates the server.

Related Issue

no-issue: internal tokenless stash feature completion (MCP retrieve).

Type of Change

  • New feature (non-breaking change that adds functionality)

Scope

  • tokenless (tokenless)

Checklist

  • My code follows the project's code style
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the documentation accordingly (tool description fields are the MCP-facing docs)
  • For tokenless: cargo clippy --workspace --all-targets -- -D warnings and cargo fmt --all --check pass
  • Lock files are up to date

Testing

cd src/tokenless
cargo fmt --all -- --check
cargo clippy --workspace --all-targets -- -D warnings
cargo test --workspace   # 6 new mcp tests; all suites green

End-to-end (separate processes — CLI compress writes stash, MCP tools/call retrieves):

$ tokenless compress-response --truncate-arrays-at 5 < 200-element-array
[1,2,3,4,5,"<... 195 items truncated, retrieve with <<tokenless:c30c…>>"]

$ tokenless mcp serve   # fed: initialize / tools/list / tools/call tokenless_retrieve {hash:c30c…}
INIT     -> {"name":"tokenless","version":"0.6.0"}
TOOLS    -> ["tokenless_retrieve"]
RETRIEVE -> isError=false, text=[6,7,8,…,200]

Additional Notes

⚠️ Stacked on #1270 — reuses that PR's open_stash_store / extract_hash CLI infra, so this branch contains its commit. Merge #1270 first, then I rebase this to a single commit.

Scope is deliberately minimal: only tokenless_retrieve is exposed (the stash retrieval tool). The broader 4-tool MCP server from the planning doc (tokenless_compress / tokenless_stats / tokenless_env_check) is a separate effort — those are already CLI commands and exposing them via MCP is not stash-related.

🤖 Generated with Claude Code

@Forrest-ly Forrest-ly left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


Code Review: feat(tokenless): add MCP tokenless_retrieve server

This PR adds a minimal MCP stdio server (tokenless mcp serve) that exposes tokenless_retrieve so an MCP-connected agent can recover stashed payloads on demand. The implementation is hand-rolled JSON-RPC (no SDK dependency), deliberately minimal in scope (one tool),
and reuses the existing open_stash_store_or_err + extract_hash infrastructure from #1270. The test coverage (6 tests) covers tool listing, dispatch errors, hash validation, round-trip retrieve, marker-line input, and missing-payload errors.

Findings

  1. 🟠 SQLite store opened per-request — wasteful for a long-running server

src/tokenless/crates/tokenless-cli/src/mcp.rs — retrieve() function

retrieve() calls open_stash_store_or_err(None) on every tools/call request. SqliteStore::new() runs Connection::open, three PRAGMAs (journal_mode=WAL, busy_timeout=5000, synchronous=NORMAL), CREATE TABLE IF NOT EXISTS, and CREATE INDEX IF NOT EXISTS — five SQL
statements before any actual retrieve query. For a long-running MCP server handling many retrieval requests in a session, this creates and tears down a SQLite connection per request.

SqliteStore is designed for shared long-lived use (wraps Connection in Mutex with poison-recovery), so creating one per request defeats its design.

Suggested fix: Open the store once at the top of serve() (or lazily on first tools/call) and pass it into retrieve() / retrieve_from_store():

pub fn serve() -> Result<(), (String, i32)> {
let store = open_stash_store_or_err(None).ok(); // fail-open at startup
// ... loop ...
"tools/call" => ok(id, handle_tool_call(params, &store)),
// ...
}


  1. 🟠 Malformed JSON silently dropped — violates JSON-RPC 2.0

src/tokenless/crates/tokenless-cli/src/mcp.rs:32-34

Err(_) => {
continue; // Skip unparseable lines
}

JSON-RPC 2.0 Section 5 requires a Parse Error response: {"jsonrpc":"2.0","id":null,"error":{"code":-32700,"message":"Parse error"}}. If a client sends syntactically broken JSON that it expects a reply to, it will hang indefinitely waiting for a response that never
comes.

The comment says "some clients prefix LSP-style headers" — but those lines are typically Content-Length: N\r\n which would fail JSON parsing. A better approach: check if the line looks like a header (starts with a letter, contains :) before attempting JSON parse, and
only send -32700 for lines that look like JSON but fail to parse.

Suggested fix: Send a -32700 error on parse failure:

Err(_) => {
let _ = writeln!(out, "{}", err(Value::Null, -32700, "Parse error"));
let _ = out.flush();
continue;
}


  1. 🟡 Missing ping handler — MCP clients may mark server as unhealthy

src/tokenless/crates/tokenless-cli/src/mcp.rs:40-57

MCP 2024-11-05 defines ping as a standard utility method that either side can send to verify liveness, expecting an empty result {}. The current code falls through to the other arm and returns -32601 method not found: ping. MCP clients (like Claude Desktop) that send
periodic health-check pings would receive a protocol error, potentially marking the server as unhealthy or disconnecting.

Suggested fix: Add "ping" => ok(id, json!({})) to the method match.


  1. 🟡 No input size limit on stdin — unbounded memory allocation

src/tokenless/crates/tokenless-cli/src/mcp.rs:20

BufRead::read_line appends to the String buffer without any size limit. A malicious or buggy client could send a single JSON line containing gigabytes of data, causing OOM. The main CLI already guards against this with MAX_INPUT_BYTES (64 MiB) via Read::take. The MCP
server has no equivalent protection.

Suggested fix: Check line.len() after each read_line and reject oversized messages, or use a capped reader.


  1. 🟡 tools/call and tools/list accepted before initialize handshake

src/tokenless/crates/tokenless-cli/src/mcp.rs:40-57

MCP requires the initialize → notifications/initialized handshake before other methods are available. The server handles tools/list and tools/call at any time, even before initialize. A well-behaved client will always initialize first, but the server should track
initialization state and return -32002 ("Server not initialized") for premature requests.

Suggested fix: Add a let mut initialized = false; flag, set it on initialize, and gate tools/list and tools/call behind it.


  1. 🔵 Duplicate extract_hash call — validation result discarded then recomputed

src/tokenless/crates/tokenless-cli/src/mcp.rs

retrieve() calls extract_hash(hash) to validate the format (checking .is_none()), discards the result, then retrieve_from_store() calls extract_hash(hash) again to actually extract the key. The string scanning work is done twice per request.

Suggested fix: Extract once in retrieve(), pass the resolved key directly to retrieve_from_store():

let key = match extract_hash(hash) {
Some(h) => h,
None if is_valid_hash(hash) => hash,
None => return tool_error("invalid stash hash: ..."),
};
// pass key directly, no re-extraction needed


  1. 🔵 Error messages disclose internal filesystem paths to the LLM

src/tokenless/crates/tokenless-cli/src/mcp.rs — retrieve() function

Error strings like "stash unavailable: cannot open stash db at /home/user/.tokenless/stash.db: " are returned as MCP tool results, which the LLM sees and may echo to the end user or include in shared transcripts. This leaks the username, home directory
structure, and database implementation details.

Consider wrapping errors in generic messages for the tool result and logging the detailed error to stderr.


Notes
  • Code shared with #1270 (stash integration, Retrieve CLI, open_stash_store_or_err, etc.) was already reviewed in the #1270 review and is not re-reviewed here.
  • SQL injection is not a concern — SqliteStore::retrieve uses rusqlite::params![] parameterized queries, and both backends normalize keys to lowercase.
  • The is_valid_hash import flagged by one reviewer is provided by the stacked #1270 commit included in this branch.
  • Review methodology: 4 independent finder angles (line-by-line scan, MCP protocol compliance, efficiency/architecture, security) followed by dedup and cross-validation.

Expose the stash retrieval path over MCP so an MCP-connected agent can
recover truncated payloads on demand — the MCP analogue of the
`tokenless retrieve` CLI (closes the stash feature's MCP gap vs Headroom
CCR's headroom_retrieve).

- New `tokenless mcp serve` subcommand: a minimal stdio JSON-RPC server
  (hand-rolled, no MCP SDK dep — keeps the zero-runtime-dependency core
  path). Implements initialize / tools/list / tools/call per MCP
  2024-11-05; notifications get no response; unknown methods return
  -32601.
- Exposes one tool, `tokenless_retrieve` (hash -> payload text). Accepts a
  bare 24-hex hash or a line containing a `<<tokenless:KEY>>` marker
  (extracted via tokenless_ccr::extract_hash), mirroring the CLI.
- Reuses open_stash_store + the passwd-rooted stash db trust model from
  the Retrieve CLI — no duplication of path validation.
- Fail model: per-request tool errors return MCP tool results
  (isError: true) so the client keeps talking; only stdin/stdout I/O
  failure terminates the server.

6 unit tests (tool schema, dispatch, missing-arg, round-trip, marker
line, not-found) + end-to-end stdio smoke verified (compress writes stash
in one process, MCP tools/call retrieves in another).

Stacked on alibaba#1270 (compressor integration + Retrieve CLI); rebase to a
single commit once that lands.

Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
@shiloong shiloong force-pushed the feature/tokenless/ccr-mcp-retrieve branch from bec0bc8 to 64e3f7b Compare July 3, 2026 06:20
@shiloong

shiloong commented Jul 3, 2026

Copy link
Copy Markdown
Collaborator Author

Thanks @Forrest-ly — all three findings addressed in 64e3f7be (force-pushed; rebased onto main now that #1270 merged, so this is a single commit). CI re-running.

1. 🟠 SQLite store opened per-request — Fixed. The store is opened once at the top of serve() (open_stash_store(None), fail-open at startup — the specific failure is logged to stderr there) and the &Option<Arc<dyn StashStore>> is threaded into handle_tool_callretrieve. No more PRAGMAs + CREATE TABLE/INDEX per tools/call. If the db is unavailable at startup, protocol handshake + tools/list still work and retrieve returns a clear "stash unavailable" tool error.

2. 🟠 Malformed JSON silently dropped — Fixed. Per JSON-RPC 2.0 §5, a line that looks like JSON (starts with {) but fails to parse now gets a -32700 Parse error response (id: null) so the client doesn't hang. Non-JSON lines (e.g. LSP-style Content-Length: headers, if a client ever sends them) are still skipped silently to avoid spamming.

3. 🟡 Missing ping handler — Fixed. Added "ping" => ok(id, json!({})) so MCP liveness checks (Claude Desktop et al.) get an empty result instead of -32601.

Tests updated for the new retrieve/handle_tool_call signatures (store ref param); 48 cli tests green.

cc @samchu-zsl

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

component:tokenless src/tokenless/

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants