Skip to content

feat(tokenless): integrate stash into response compression + retrieve CLI#1270

Merged
Forrest-ly merged 1 commit into
alibaba:mainfrom
shiloong:feature/tokenless/ccr-integration
Jul 3, 2026
Merged

feat(tokenless): integrate stash into response compression + retrieve CLI#1270
Forrest-ly merged 1 commit into
alibaba:mainfrom
shiloong:feature/tokenless/ccr-integration

Conversation

@shiloong

@shiloong shiloong commented Jul 2, 2026

Copy link
Copy Markdown
Collaborator

Description

Wires the reversible stash from tokenless-ccr (merged in #1255) into the response compression path so dropped array items become retrievable instead of permanently lost. This is part 2 of the stash feature (part 1 = the crate; part 3 = tests + docs, opened as a stacked PR on top of this one).

ResponseCompressor (crates/tokenless-schema):

  • Add stash_store: Option<Arc<dyn StashStore>>. None (the default) preserves the original lossy truncation path — zero core-path impact.
  • with_stash_store() builder; compress_array stashes the raw dropped tail and embeds a <<tokenless:KEY>> marker. On stash absence/failure the marker degrades to the plain <... N more items truncated> form (fail-open).
  • Schema crate depends on tokenless-ccr with default-features = false — pulls only the trait + key + marker, no rusqlite (the CLI supplies the concrete backend).

CLI (crates/tokenless-cli):

  • compress-response gains --no-stash (opt out) and --stash-db (override path). By default a SqliteStore is opened at ~/.tokenless/stash.db and injected; open_stash_store fails open to None on any error so compression proceeds without stash.
  • New Retrieve subcommand: reads the stash db, accepts either a bare 24-hex hash or a line containing a <<tokenless:HASH>> marker (extracted via tokenless_ccr::extract_hash). Retrieve is user-initiated, so failures surface as errors rather than fail-open.
  • Stash db path resolution mirrors the stats db trust model: passwd-rooted home (not $HOME), TOKENLESS_STASH_DB / --stash-db validated via the existing validate_db_path (canonicalize + starts_with home) so an attacker cannot redirect the stash to a system-critical location. Reused validate_db_path rather than duplicating.

Scope decision: SchemaCompressor's description truncation is intentionally left lossy — a stash marker (~65 chars) against 256/160-char description limits would be 25–40% overhead. Array truncation is the high-value case and is covered here.

Related Issue

no-issue: internal tokenless stash feature, part 2 of 3 (part 1 merged in #1255).

Type of Change

  • New feature (non-breaking change that adds functionality)

Scope

  • tokenless (tokenless)

Checklist

  • My code follows the project's code style
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the documentation accordingly
  • For tokenless: cargo clippy --workspace --all-targets -- -D warnings and cargo fmt --all --check pass
  • Lock files are up to date (Cargo.lock)

Testing

cd src/tokenless
cargo fmt --all -- --check
cargo clippy --workspace --all-targets -- -D warnings
cargo test --workspace   # 8 suites green

End-to-end (shipped binary, separate processes — proves the cross-process SQLite model for the fork+exec hook path):

echo '[1,...,200]' | tokenless compress-response --truncate-arrays-at 5
# -> [1,2,3,4,5,"<... 195 items truncated, retrieve with <<tokenless:c30c…>>"]
tokenless retrieve c30ccf5ed1125e0ed871ba8e   # separate process, same stash db
# -> [6,7,8,…,200]
tokenless compress-response --no-stash ...    # lossy path (pre-stash behavior)

🤖 Generated with Claude Code

@Forrest-ly Forrest-ly left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


Code Review: feat(tokenless): integrate stash into response compression + retrieve CLI

This PR wires the reversible stash from tokenless-ccr into ResponseCompressor so truncated array items can be retrieved via a new tokenless retrieve CLI subcommand. The stash is optional and fail-open — compression proceeds without stash on any error, falling back to
the original lossy truncation marker. The implementation is well-structured: the schema crate pulls only the trait (no SQLite), and the CLI supplies the concrete backend.

Findings

  1. 🟠 Stash writes orphaned when compressed output is discarded

src/tokenless/crates/tokenless-cli/src/main.rs:454

compressor.compress(&value) (which now writes stash entries inside compress_array) runs before the no-savings check (after_tokens >= before_tokens, line 385) and the dry-run check (compression_on == false, line 398). In both cases the compressed output containing
<tokenless:HASH> markers is thrown away and the original input is emitted. The stash entries are written to SQLite but their markers are never seen by the LLM, producing unretrievable orphans.

While TTL expiry eventually cleans them up, this is unnecessary I/O on every dry-run invocation that truncates arrays.

Suggested fix: check compression_on and/or estimate token savings before calling compress(), or skip with_stash_store() when compression is disabled.


  1. 🟠 Asymmetric error handling: --stash-db rejection disables stash, TOKENLESS_STASH_DB rejection falls back to default

src/tokenless/crates/tokenless-cli/src/main.rs:313

In get_stash_db_path():

  • --stash-db fails validate_db_path → returns None (stash completely disabled)
  • TOKENLESS_STASH_DB fails the same validation → falls back to ~/.tokenless/stash.db

A typo in --stash-db (e.g. --stash-db /tmpp/stash.db) silently disables the stash with only a stderr warning — the user loses reversibility without a hard error. There is no established precedent for this "strict on flag, lenient on env var" pattern in the existing
get_db_path().

Suggested fix: either fall back to the default path on both failure paths, or surface a hard error on both.


  1. 🟡 Retrieve reuses fail-open open_stash_store() but needs fail-hard semantics

src/tokenless/crates/tokenless-cli/src/main.rs:498

open_stash_store() was designed for the compression path where silent degradation is correct. Reusing it for the user-initiated Retrieve subcommand means the user gets a generic error:

stash unavailable: no trusted home directory or cannot open stash db

…that may not match the actual failure (corrupt DB, path rejected, etc.). The eprintln diagnostics do print to stderr, but programmatic consumers (scripts checking exit code + stdout) see only the generic message.

Suggested fix: add an open_stash_store_or_err() -> Result<Arc, String> that returns the specific cause, and use it in Retrieve.


  1. 🟡 stash_dropped silently swallows stash write errors with no diagnostic

src/tokenless/crates/tokenless-schema/src/response_compressor.rs:241

stash.stash(&payload).ok() // StashError → None, no logging

Any StashError (disk full, locked DB, I/O failure) is silently converted to None. The truncation marker degrades to the non-retrievable form. A persistent backend failure (e.g. full disk) would be invisible until a user tries to retrieve and gets a miss.

Since this is library code, avoiding eprintln may be intentional — but consider returning Result from stash_dropped so the CLI layer can choose whether to log.


  1. 🟡 Retrieve accepts arbitrary strings without hash format validation

src/tokenless/crates/tokenless-cli/src/main.rs:510

let key = extract_hash(&hash).unwrap_or(hash.as_str());

When extract_hash returns None (no <tokenless:...> marker found), the raw input is passed directly to store.retrieve() without checking it's a valid 24-hex-char key. A user who mistakenly passes a file path gets:

no stashed payload for hash: /some/path

…instead of a clear format error.

Suggested fix: validate the hex format before the DB round-trip.


  1. 🔵 Stashed items preserve fields the compressor would strip (intentional, but note data-hygiene implication)

src/tokenless/crates/tokenless-schema/src/response_compressor.rs:219

The dropped slice &arr[self.truncate_arrays_at..] captures original values before compress_value runs. Kept items have drop_fields (debug/stacktrace) stripped, nulls removed, and max_depth enforced — but stashed items preserve everything.

The PR description explicitly acknowledges this ("retrieval yields the original content verbatim"), so this is a design decision. However, if drop_fields serves a data-hygiene purpose (stripping PII or internal debug info), those fields survive in the stash DB and are
returned verbatim by Retrieve. Worth a comment in the code documenting this as intentional.


Review methodology

Medium-effort review: 6 independent finder angles (line-by-line scan, removed-behavior audit, cross-file trace, reuse/simplification/efficiency, altitude, conventions) each producing up to 6 candidates, followed by 1-vote verification per candidate.

… CLI

Wire the reversible stash from tokenless-ccr into the response compression
path so dropped array items become retrievable instead of permanently lost.

ResponseCompressor:
- Add optional stash_store: Option<Arc<dyn StashStore>>. None (the default)
  preserves the original lossy truncation path — zero core-path impact.
- with_stash_store builder; compress_array stashes the raw dropped tail and
  embeds a <<tokenless:KEY>> marker. On stash absence/failure the marker
  degrades to the plain "<... N more items truncated>" form (fail-open).
- Tests cover lossy (no stash), round-trip (stash + retrieve), and the
  failing-backend fallback.

CLI:
- CompressResponse gains --no-stash (opt out) and --stash-db (override path).
  By default a SqliteStore is opened at ~/.tokenless/stash.db and injected;
  open_stash_store fails open to None on any error so compression proceeds.
- New Retrieve subcommand: reads the stash db, accepts either a bare 24-hex
  hash or a line containing a <<tokenless:HASH>> marker (extracted via
  tokenless_ccr::extract_hash). Retrieve is user-initiated, so failures
  surface as errors rather than fail-open.
- Stash db path resolution mirrors the stats db trust model: passwd-rooted
  home, TOKENLESS_STASH_DB / --stash-db validated against the trusted home
  (canonicalize + starts_with) so an attacker cannot redirect the stash to a
  system-critical location.

SchemaCompressor description truncation is intentionally left lossy: a stash
marker (~65 chars) against 256/160-char description limits would be 25-40%
overhead. Array truncation is the high-value case and is covered here.

End-to-end verified across processes: compress writes to stash.db, a separate
retrieve invocation recovers the original dropped items. fmt/clippy/test
green across the workspace.

Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
@shiloong shiloong force-pushed the feature/tokenless/ccr-integration branch from 2a4a781 to 3eefed4 Compare July 3, 2026 03:18
shiloong added a commit to shiloong/anolisa that referenced this pull request Jul 3, 2026
Expose the stash retrieval path over MCP so an MCP-connected agent can
recover truncated payloads on demand — the MCP analogue of the
`tokenless retrieve` CLI (closes the stash feature's MCP gap vs Headroom
CCR's headroom_retrieve).

- New `tokenless mcp serve` subcommand: a minimal stdio JSON-RPC server
  (hand-rolled, no MCP SDK dep — keeps the zero-runtime-dependency core
  path). Implements initialize / tools/list / tools/call per MCP
  2024-11-05; notifications get no response; unknown methods return
  -32601.
- Exposes one tool, `tokenless_retrieve` (hash -> payload text). Accepts a
  bare 24-hex hash or a line containing a `<<tokenless:KEY>>` marker
  (extracted via tokenless_ccr::extract_hash), mirroring the CLI.
- Reuses open_stash_store + the passwd-rooted stash db trust model from
  the Retrieve CLI — no duplication of path validation.
- Fail model: per-request tool errors return MCP tool results
  (isError: true) so the client keeps talking; only stdin/stdout I/O
  failure terminates the server.

6 unit tests (tool schema, dispatch, missing-arg, round-trip, marker
line, not-found) + end-to-end stdio smoke verified (compress writes stash
in one process, MCP tools/call retrieves in another).

Stacked on alibaba#1270 (compressor integration + Retrieve CLI); rebase to a
single commit once that lands.

Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
shiloong added a commit to shiloong/anolisa that referenced this pull request Jul 3, 2026
Add stash observability so compression stats carry how many stash writes
each compress triggered and the live stash entry count — the gauge for
tuning stash TTL/capacity and verifying the reversible path is exercised
in production.

ResponseCompressor (schema crate):
- Add an AtomicUsize stash-write counter, reset at the start of each
  compress() and incremented on every successful stash_dropped(). Expose
  stash_writes() so the CLI can read it without the schema crate
  depending on the stats crate (no new cross-crate coupling).

StatsRecord (stats crate):
- Add nullable stash_writes / stash_size columns (Option<i64>) +
  with_stash() builder. Migration extends the existing
  pragma_table_info + ALTER TABLE pattern (now (col, type) tuples) so
  old DBs gain the columns at first open; pre-stash records read back
  as None. INSERT/SELECT/row_to_record updated.

CLI:
- compress-response reads compressor.stash_writes() + stash.len() after
  compress and records them; compress-schema / compress-toon pass None
  (no stash path). record_compression_stats gains the two params.

retrieve-side hits/misses are intentionally not added here — retrieve is
not a compression operation and would need a new OperationType + schema
change; deferred until there's a retrieve-stats use case.

5 new tests (counter zero/resets/round-trip + stats round-trip +
default-None). fmt/clippy(--all-targets)/test green.

Stacked on alibaba#1270 (compressor integration); rebase to a single commit once
that lands.

Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
@shiloong

shiloong commented Jul 3, 2026

Copy link
Copy Markdown
Collaborator Author

Thanks @Forrest-ly — findings #1, #2, #3, #5, #6 addressed in 3eefed41 (force-pushed, stacked PRs #1271/#1285/#1291 rebased on top); #4 deferred to #1291 (reasoning below). CI re-running.

1. 🟠 Orphaned stash writes when output discarded — Fixed per your "skip with_stash_store() when compression is disabled" suggestion. Config is now loaded before the stash decision, and the stash is skipped entirely when !compression_on (dry-run) — so dry-run invocations never write stash entries whose markers would be discarded. The no-savings discard edge (compression on but after_tokens >= before_tokens) is left with an inline comment: truncation almost always yields savings so this is rare, and orphaned entries are TTL-cleaned. A full double-compress (probe-without-stash → recompress-with-stash) would close the no-savings edge too but doubles the compress on the savings path; chose the minimal fix.

2. 🟠 Asymmetric --stash-db / TOKENLESS_STASH_DB handling — Fixed. --stash-db validation failure now falls back to the default path under the trusted home (with the same stderr warning as TOKENLESS_STASH_DB), instead of silently disabling the stash. Both override paths are now lenient-and-consistent, matching the get_db_path() precedent.

3. 🟡 Retrieve fail-open vs fail-hard — Fixed. Added open_stash_store_or_err() -> Result<Arc<dyn StashStore>, String> that surfaces the specific cause (no home / path rejected / cannot create dir / cannot open db); Retrieve uses it so the user sees the real reason. open_stash_store() (fail-open, for compress) now wraps it, so the two paths share one implementation.

5. 🟡 Retrieve accepts arbitrary strings without hash format validation — Fixed. Exposed tokenless_ccr::is_valid_hash (public); Retrieve validates the bare hash is 24 ASCII hex before the DB round-trip and returns a clear invalid stash hash: ... (expected 24 hex chars or a <<tokenless:HASH>> marker) error. (Applied the same to the MCP tokenless_retrieve tool in #1285 for consistency.)

6. 🔵 Stashed items preserve fields (data-hygiene note) — Fixed with an inline comment in compress_array: the dropped slice is captured pre-compress_value, so stashed items preserve drop_fields/nulls/depth verbatim by design (retrieval must be lossless); if a field must not survive in the stash DB, strip it upstream of the compressor.

4. 🟡 stash_dropped silently swallows errors — Deferred to #1291. That PR already adds an AtomicUsize stash_writes counter (successful writes) for stats; a matching stash_errors counter that the CLI logs fits there naturally (same mechanism, same crate boundary — schema exposes a counter, CLI logs). Doing it in #1291 keeps this PR's diff focused and avoids touching the counter area twice. Will follow up in #1291.

cc @samchu-zsl

shiloong added a commit to shiloong/anolisa that referenced this pull request Jul 3, 2026
Add stash observability so compression stats carry how many stash writes
each compress triggered and the live stash entry count — the gauge for
tuning stash TTL/capacity and verifying the reversible path is exercised
in production.

ResponseCompressor (schema crate):
- Add an AtomicUsize stash-write counter, reset at the start of each
  compress() and incremented on every successful stash_dropped(). Expose
  stash_writes() so the CLI can read it without the schema crate
  depending on the stats crate (no new cross-crate coupling).

StatsRecord (stats crate):
- Add nullable stash_writes / stash_size columns (Option<i64>) +
  with_stash() builder. Migration extends the existing
  pragma_table_info + ALTER TABLE pattern (now (col, type) tuples) so
  old DBs gain the columns at first open; pre-stash records read back
  as None. INSERT/SELECT/row_to_record updated.

CLI:
- compress-response reads compressor.stash_writes() + stash.len() after
  compress and records them; compress-schema / compress-toon pass None
  (no stash path). record_compression_stats gains the two params.

retrieve-side hits/misses are intentionally not added here — retrieve is
not a compression operation and would need a new OperationType + schema
change; deferred until there's a retrieve-stats use case.

5 new tests (counter zero/resets/round-trip + stats round-trip +
default-None). fmt/clippy(--all-targets)/test green.

Stacked on alibaba#1270 (compressor integration); rebase to a single commit once
that lands.

Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>

@Forrest-ly Forrest-ly left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Forrest-ly Forrest-ly merged commit 9373cb3 into alibaba:main Jul 3, 2026
25 of 26 checks passed
shiloong added a commit to shiloong/anolisa that referenced this pull request Jul 3, 2026
Expose the stash retrieval path over MCP so an MCP-connected agent can
recover truncated payloads on demand — the MCP analogue of the
`tokenless retrieve` CLI (closes the stash feature's MCP gap vs Headroom
CCR's headroom_retrieve).

- New `tokenless mcp serve` subcommand: a minimal stdio JSON-RPC server
  (hand-rolled, no MCP SDK dep — keeps the zero-runtime-dependency core
  path). Implements initialize / tools/list / tools/call per MCP
  2024-11-05; notifications get no response; unknown methods return
  -32601.
- Exposes one tool, `tokenless_retrieve` (hash -> payload text). Accepts a
  bare 24-hex hash or a line containing a `<<tokenless:KEY>>` marker
  (extracted via tokenless_ccr::extract_hash), mirroring the CLI.
- Reuses open_stash_store + the passwd-rooted stash db trust model from
  the Retrieve CLI — no duplication of path validation.
- Fail model: per-request tool errors return MCP tool results
  (isError: true) so the client keeps talking; only stdin/stdout I/O
  failure terminates the server.

6 unit tests (tool schema, dispatch, missing-arg, round-trip, marker
line, not-found) + end-to-end stdio smoke verified (compress writes stash
in one process, MCP tools/call retrieves in another).

Stacked on alibaba#1270 (compressor integration + Retrieve CLI); rebase to a
single commit once that lands.

Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
shiloong added a commit to shiloong/anolisa that referenced this pull request Jul 3, 2026
Add stash observability so compression stats carry how many stash writes
each compress triggered and the live stash entry count — the gauge for
tuning stash TTL/capacity and verifying the reversible path is exercised
in production.

ResponseCompressor (schema crate):
- Add an AtomicUsize stash-write counter, reset at the start of each
  compress() and incremented on every successful stash_dropped(). Expose
  stash_writes() so the CLI can read it without the schema crate
  depending on the stats crate (no new cross-crate coupling).

StatsRecord (stats crate):
- Add nullable stash_writes / stash_size columns (Option<i64>) +
  with_stash() builder. Migration extends the existing
  pragma_table_info + ALTER TABLE pattern (now (col, type) tuples) so
  old DBs gain the columns at first open; pre-stash records read back
  as None. INSERT/SELECT/row_to_record updated.

CLI:
- compress-response reads compressor.stash_writes() + stash.len() after
  compress and records them; compress-schema / compress-toon pass None
  (no stash path). record_compression_stats gains the two params.

retrieve-side hits/misses are intentionally not added here — retrieve is
not a compression operation and would need a new OperationType + schema
change; deferred until there's a retrieve-stats use case.

5 new tests (counter zero/resets/round-trip + stats round-trip +
default-None). fmt/clippy(--all-targets)/test green.

Stacked on alibaba#1270 (compressor integration); rebase to a single commit once
that lands.

Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
shiloong added a commit to shiloong/anolisa that referenced this pull request Jul 3, 2026
Add stash observability so compression stats carry how many stash writes
each compress triggered and the live stash entry count — the gauge for
tuning stash TTL/capacity and verifying the reversible path is exercised
in production.

ResponseCompressor (schema crate):
- Add an AtomicUsize stash-write counter, reset at the start of each
  compress() and incremented on every successful stash_dropped(). Expose
  stash_writes() so the CLI can read it without the schema crate
  depending on the stats crate (no new cross-crate coupling).

StatsRecord (stats crate):
- Add nullable stash_writes / stash_size columns (Option<i64>) +
  with_stash() builder. Migration extends the existing
  pragma_table_info + ALTER TABLE pattern (now (col, type) tuples) so
  old DBs gain the columns at first open; pre-stash records read back
  as None. INSERT/SELECT/row_to_record updated.

CLI:
- compress-response reads compressor.stash_writes() + stash.len() after
  compress and records them; compress-schema / compress-toon pass None
  (no stash path). record_compression_stats gains the two params.

retrieve-side hits/misses are intentionally not added here — retrieve is
not a compression operation and would need a new OperationType + schema
change; deferred until there's a retrieve-stats use case.

5 new tests (counter zero/resets/round-trip + stats round-trip +
default-None). fmt/clippy(--all-targets)/test green.

Stacked on alibaba#1270 (compressor integration); rebase to a single commit once
that lands.

Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

component:tokenless src/tokenless/

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants