feat(tokenless): integrate stash into response compression + retrieve CLI#1270
Conversation
Forrest-ly
left a comment
There was a problem hiding this comment.
Code Review: feat(tokenless): integrate stash into response compression + retrieve CLI
This PR wires the reversible stash from tokenless-ccr into ResponseCompressor so truncated array items can be retrieved via a new tokenless retrieve CLI subcommand. The stash is optional and fail-open — compression proceeds without stash on any error, falling back to
the original lossy truncation marker. The implementation is well-structured: the schema crate pulls only the trait (no SQLite), and the CLI supplies the concrete backend.
Findings
- 🟠 Stash writes orphaned when compressed output is discarded
src/tokenless/crates/tokenless-cli/src/main.rs:454
compressor.compress(&value) (which now writes stash entries inside compress_array) runs before the no-savings check (after_tokens >= before_tokens, line 385) and the dry-run check (compression_on == false, line 398). In both cases the compressed output containing
<tokenless:HASH> markers is thrown away and the original input is emitted. The stash entries are written to SQLite but their markers are never seen by the LLM, producing unretrievable orphans.
While TTL expiry eventually cleans them up, this is unnecessary I/O on every dry-run invocation that truncates arrays.
Suggested fix: check compression_on and/or estimate token savings before calling compress(), or skip with_stash_store() when compression is disabled.
- 🟠 Asymmetric error handling: --stash-db rejection disables stash, TOKENLESS_STASH_DB rejection falls back to default
src/tokenless/crates/tokenless-cli/src/main.rs:313
In get_stash_db_path():
- --stash-db fails validate_db_path → returns None (stash completely disabled)
- TOKENLESS_STASH_DB fails the same validation → falls back to ~/.tokenless/stash.db
A typo in --stash-db (e.g. --stash-db /tmpp/stash.db) silently disables the stash with only a stderr warning — the user loses reversibility without a hard error. There is no established precedent for this "strict on flag, lenient on env var" pattern in the existing
get_db_path().
Suggested fix: either fall back to the default path on both failure paths, or surface a hard error on both.
- 🟡 Retrieve reuses fail-open open_stash_store() but needs fail-hard semantics
src/tokenless/crates/tokenless-cli/src/main.rs:498
open_stash_store() was designed for the compression path where silent degradation is correct. Reusing it for the user-initiated Retrieve subcommand means the user gets a generic error:
stash unavailable: no trusted home directory or cannot open stash db
…that may not match the actual failure (corrupt DB, path rejected, etc.). The eprintln diagnostics do print to stderr, but programmatic consumers (scripts checking exit code + stdout) see only the generic message.
Suggested fix: add an open_stash_store_or_err() -> Result<Arc, String> that returns the specific cause, and use it in Retrieve.
- 🟡 stash_dropped silently swallows stash write errors with no diagnostic
src/tokenless/crates/tokenless-schema/src/response_compressor.rs:241
stash.stash(&payload).ok() // StashError → None, no logging
Any StashError (disk full, locked DB, I/O failure) is silently converted to None. The truncation marker degrades to the non-retrievable form. A persistent backend failure (e.g. full disk) would be invisible until a user tries to retrieve and gets a miss.
Since this is library code, avoiding eprintln may be intentional — but consider returning Result from stash_dropped so the CLI layer can choose whether to log.
- 🟡 Retrieve accepts arbitrary strings without hash format validation
src/tokenless/crates/tokenless-cli/src/main.rs:510
let key = extract_hash(&hash).unwrap_or(hash.as_str());
When extract_hash returns None (no <tokenless:...> marker found), the raw input is passed directly to store.retrieve() without checking it's a valid 24-hex-char key. A user who mistakenly passes a file path gets:
no stashed payload for hash: /some/path
…instead of a clear format error.
Suggested fix: validate the hex format before the DB round-trip.
- 🔵 Stashed items preserve fields the compressor would strip (intentional, but note data-hygiene implication)
src/tokenless/crates/tokenless-schema/src/response_compressor.rs:219
The dropped slice &arr[self.truncate_arrays_at..] captures original values before compress_value runs. Kept items have drop_fields (debug/stacktrace) stripped, nulls removed, and max_depth enforced — but stashed items preserve everything.
The PR description explicitly acknowledges this ("retrieval yields the original content verbatim"), so this is a design decision. However, if drop_fields serves a data-hygiene purpose (stripping PII or internal debug info), those fields survive in the stash DB and are
returned verbatim by Retrieve. Worth a comment in the code documenting this as intentional.
Review methodology
Medium-effort review: 6 independent finder angles (line-by-line scan, removed-behavior audit, cross-file trace, reuse/simplification/efficiency, altitude, conventions) each producing up to 6 candidates, followed by 1-vote verification per candidate.
… CLI Wire the reversible stash from tokenless-ccr into the response compression path so dropped array items become retrievable instead of permanently lost. ResponseCompressor: - Add optional stash_store: Option<Arc<dyn StashStore>>. None (the default) preserves the original lossy truncation path — zero core-path impact. - with_stash_store builder; compress_array stashes the raw dropped tail and embeds a <<tokenless:KEY>> marker. On stash absence/failure the marker degrades to the plain "<... N more items truncated>" form (fail-open). - Tests cover lossy (no stash), round-trip (stash + retrieve), and the failing-backend fallback. CLI: - CompressResponse gains --no-stash (opt out) and --stash-db (override path). By default a SqliteStore is opened at ~/.tokenless/stash.db and injected; open_stash_store fails open to None on any error so compression proceeds. - New Retrieve subcommand: reads the stash db, accepts either a bare 24-hex hash or a line containing a <<tokenless:HASH>> marker (extracted via tokenless_ccr::extract_hash). Retrieve is user-initiated, so failures surface as errors rather than fail-open. - Stash db path resolution mirrors the stats db trust model: passwd-rooted home, TOKENLESS_STASH_DB / --stash-db validated against the trusted home (canonicalize + starts_with) so an attacker cannot redirect the stash to a system-critical location. SchemaCompressor description truncation is intentionally left lossy: a stash marker (~65 chars) against 256/160-char description limits would be 25-40% overhead. Array truncation is the high-value case and is covered here. End-to-end verified across processes: compress writes to stash.db, a separate retrieve invocation recovers the original dropped items. fmt/clippy/test green across the workspace. Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
2a4a781 to
3eefed4
Compare
Expose the stash retrieval path over MCP so an MCP-connected agent can recover truncated payloads on demand — the MCP analogue of the `tokenless retrieve` CLI (closes the stash feature's MCP gap vs Headroom CCR's headroom_retrieve). - New `tokenless mcp serve` subcommand: a minimal stdio JSON-RPC server (hand-rolled, no MCP SDK dep — keeps the zero-runtime-dependency core path). Implements initialize / tools/list / tools/call per MCP 2024-11-05; notifications get no response; unknown methods return -32601. - Exposes one tool, `tokenless_retrieve` (hash -> payload text). Accepts a bare 24-hex hash or a line containing a `<<tokenless:KEY>>` marker (extracted via tokenless_ccr::extract_hash), mirroring the CLI. - Reuses open_stash_store + the passwd-rooted stash db trust model from the Retrieve CLI — no duplication of path validation. - Fail model: per-request tool errors return MCP tool results (isError: true) so the client keeps talking; only stdin/stdout I/O failure terminates the server. 6 unit tests (tool schema, dispatch, missing-arg, round-trip, marker line, not-found) + end-to-end stdio smoke verified (compress writes stash in one process, MCP tools/call retrieves in another). Stacked on alibaba#1270 (compressor integration + Retrieve CLI); rebase to a single commit once that lands. Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
Add stash observability so compression stats carry how many stash writes each compress triggered and the live stash entry count — the gauge for tuning stash TTL/capacity and verifying the reversible path is exercised in production. ResponseCompressor (schema crate): - Add an AtomicUsize stash-write counter, reset at the start of each compress() and incremented on every successful stash_dropped(). Expose stash_writes() so the CLI can read it without the schema crate depending on the stats crate (no new cross-crate coupling). StatsRecord (stats crate): - Add nullable stash_writes / stash_size columns (Option<i64>) + with_stash() builder. Migration extends the existing pragma_table_info + ALTER TABLE pattern (now (col, type) tuples) so old DBs gain the columns at first open; pre-stash records read back as None. INSERT/SELECT/row_to_record updated. CLI: - compress-response reads compressor.stash_writes() + stash.len() after compress and records them; compress-schema / compress-toon pass None (no stash path). record_compression_stats gains the two params. retrieve-side hits/misses are intentionally not added here — retrieve is not a compression operation and would need a new OperationType + schema change; deferred until there's a retrieve-stats use case. 5 new tests (counter zero/resets/round-trip + stats round-trip + default-None). fmt/clippy(--all-targets)/test green. Stacked on alibaba#1270 (compressor integration); rebase to a single commit once that lands. Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
|
Thanks @Forrest-ly — findings #1, #2, #3, #5, #6 addressed in 1. 🟠 Orphaned stash writes when output discarded — Fixed per your "skip 2. 🟠 Asymmetric --stash-db / TOKENLESS_STASH_DB handling — Fixed. 3. 🟡 Retrieve fail-open vs fail-hard — Fixed. Added 5. 🟡 Retrieve accepts arbitrary strings without hash format validation — Fixed. Exposed 6. 🔵 Stashed items preserve fields (data-hygiene note) — Fixed with an inline comment in 4. 🟡 cc @samchu-zsl |
Add stash observability so compression stats carry how many stash writes each compress triggered and the live stash entry count — the gauge for tuning stash TTL/capacity and verifying the reversible path is exercised in production. ResponseCompressor (schema crate): - Add an AtomicUsize stash-write counter, reset at the start of each compress() and incremented on every successful stash_dropped(). Expose stash_writes() so the CLI can read it without the schema crate depending on the stats crate (no new cross-crate coupling). StatsRecord (stats crate): - Add nullable stash_writes / stash_size columns (Option<i64>) + with_stash() builder. Migration extends the existing pragma_table_info + ALTER TABLE pattern (now (col, type) tuples) so old DBs gain the columns at first open; pre-stash records read back as None. INSERT/SELECT/row_to_record updated. CLI: - compress-response reads compressor.stash_writes() + stash.len() after compress and records them; compress-schema / compress-toon pass None (no stash path). record_compression_stats gains the two params. retrieve-side hits/misses are intentionally not added here — retrieve is not a compression operation and would need a new OperationType + schema change; deferred until there's a retrieve-stats use case. 5 new tests (counter zero/resets/round-trip + stats round-trip + default-None). fmt/clippy(--all-targets)/test green. Stacked on alibaba#1270 (compressor integration); rebase to a single commit once that lands. Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
Expose the stash retrieval path over MCP so an MCP-connected agent can recover truncated payloads on demand — the MCP analogue of the `tokenless retrieve` CLI (closes the stash feature's MCP gap vs Headroom CCR's headroom_retrieve). - New `tokenless mcp serve` subcommand: a minimal stdio JSON-RPC server (hand-rolled, no MCP SDK dep — keeps the zero-runtime-dependency core path). Implements initialize / tools/list / tools/call per MCP 2024-11-05; notifications get no response; unknown methods return -32601. - Exposes one tool, `tokenless_retrieve` (hash -> payload text). Accepts a bare 24-hex hash or a line containing a `<<tokenless:KEY>>` marker (extracted via tokenless_ccr::extract_hash), mirroring the CLI. - Reuses open_stash_store + the passwd-rooted stash db trust model from the Retrieve CLI — no duplication of path validation. - Fail model: per-request tool errors return MCP tool results (isError: true) so the client keeps talking; only stdin/stdout I/O failure terminates the server. 6 unit tests (tool schema, dispatch, missing-arg, round-trip, marker line, not-found) + end-to-end stdio smoke verified (compress writes stash in one process, MCP tools/call retrieves in another). Stacked on alibaba#1270 (compressor integration + Retrieve CLI); rebase to a single commit once that lands. Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
Add stash observability so compression stats carry how many stash writes each compress triggered and the live stash entry count — the gauge for tuning stash TTL/capacity and verifying the reversible path is exercised in production. ResponseCompressor (schema crate): - Add an AtomicUsize stash-write counter, reset at the start of each compress() and incremented on every successful stash_dropped(). Expose stash_writes() so the CLI can read it without the schema crate depending on the stats crate (no new cross-crate coupling). StatsRecord (stats crate): - Add nullable stash_writes / stash_size columns (Option<i64>) + with_stash() builder. Migration extends the existing pragma_table_info + ALTER TABLE pattern (now (col, type) tuples) so old DBs gain the columns at first open; pre-stash records read back as None. INSERT/SELECT/row_to_record updated. CLI: - compress-response reads compressor.stash_writes() + stash.len() after compress and records them; compress-schema / compress-toon pass None (no stash path). record_compression_stats gains the two params. retrieve-side hits/misses are intentionally not added here — retrieve is not a compression operation and would need a new OperationType + schema change; deferred until there's a retrieve-stats use case. 5 new tests (counter zero/resets/round-trip + stats round-trip + default-None). fmt/clippy(--all-targets)/test green. Stacked on alibaba#1270 (compressor integration); rebase to a single commit once that lands. Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
Add stash observability so compression stats carry how many stash writes each compress triggered and the live stash entry count — the gauge for tuning stash TTL/capacity and verifying the reversible path is exercised in production. ResponseCompressor (schema crate): - Add an AtomicUsize stash-write counter, reset at the start of each compress() and incremented on every successful stash_dropped(). Expose stash_writes() so the CLI can read it without the schema crate depending on the stats crate (no new cross-crate coupling). StatsRecord (stats crate): - Add nullable stash_writes / stash_size columns (Option<i64>) + with_stash() builder. Migration extends the existing pragma_table_info + ALTER TABLE pattern (now (col, type) tuples) so old DBs gain the columns at first open; pre-stash records read back as None. INSERT/SELECT/row_to_record updated. CLI: - compress-response reads compressor.stash_writes() + stash.len() after compress and records them; compress-schema / compress-toon pass None (no stash path). record_compression_stats gains the two params. retrieve-side hits/misses are intentionally not added here — retrieve is not a compression operation and would need a new OperationType + schema change; deferred until there's a retrieve-stats use case. 5 new tests (counter zero/resets/round-trip + stats round-trip + default-None). fmt/clippy(--all-targets)/test green. Stacked on alibaba#1270 (compressor integration); rebase to a single commit once that lands. Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
Description
Wires the reversible stash from
tokenless-ccr(merged in #1255) into the response compression path so dropped array items become retrievable instead of permanently lost. This is part 2 of the stash feature (part 1 = the crate; part 3 = tests + docs, opened as a stacked PR on top of this one).ResponseCompressor (
crates/tokenless-schema):stash_store: Option<Arc<dyn StashStore>>.None(the default) preserves the original lossy truncation path — zero core-path impact.with_stash_store()builder;compress_arraystashes the raw dropped tail and embeds a<<tokenless:KEY>>marker. On stash absence/failure the marker degrades to the plain<... N more items truncated>form (fail-open).tokenless-ccrwithdefault-features = false— pulls only the trait + key + marker, norusqlite(the CLI supplies the concrete backend).CLI (
crates/tokenless-cli):compress-responsegains--no-stash(opt out) and--stash-db(override path). By default aSqliteStoreis opened at~/.tokenless/stash.dband injected;open_stash_storefails open toNoneon any error so compression proceeds without stash.Retrievesubcommand: reads the stash db, accepts either a bare 24-hex hash or a line containing a<<tokenless:HASH>>marker (extracted viatokenless_ccr::extract_hash). Retrieve is user-initiated, so failures surface as errors rather than fail-open.$HOME),TOKENLESS_STASH_DB/--stash-dbvalidated via the existingvalidate_db_path(canonicalize +starts_withhome) so an attacker cannot redirect the stash to a system-critical location. Reusedvalidate_db_pathrather than duplicating.Scope decision:
SchemaCompressor's description truncation is intentionally left lossy — a stash marker (~65 chars) against 256/160-char description limits would be 25–40% overhead. Array truncation is the high-value case and is covered here.Related Issue
no-issue: internal tokenless stash feature, part 2 of 3 (part 1 merged in #1255).
Type of Change
Scope
tokenless(tokenless)Checklist
tokenless:cargo clippy --workspace --all-targets -- -D warningsandcargo fmt --all --checkpassCargo.lock)Testing
End-to-end (shipped binary, separate processes — proves the cross-process SQLite model for the fork+exec hook path):
🤖 Generated with Claude Code