feat(hook): auto-capture code areas the agent edits #261

Open
pszymkowiak wants to merge 1 commit into develop from feat/code-areas-auto-capture

Conversation

@pszymkowiak
Contributor

Summary

Closes #196.

Adds a hook-driven, no-MCP equivalent of Context-Engine's
record_code_area: every time Claude Code / Codex / Gemini / Copilot
calls Edit / Write / MultiEdit / NotebookEdit, the existing
PostToolUse hook (icm hook post) extracts tool_input.file_path and
upserts a row in a new code_areas table. Same (project, file_path)
increments touch_count instead of duplicating, so the table grows in
file count, not edit count.

What's new

  • Schema: new code_areas(id, project, file_path, description, session_id, tool_name, touch_count, first_touched_at, last_touched_at) table with UNIQUE(project, file_path) constraint
    driving the upsert.
  • Store API:
    • SqliteStore::upsert_code_area(project, file_path, description, session_id, tool_name) — ON CONFLICT bumps touch_count,
      refreshes last_touched_at / session_id / tool_name, and only
      overwrites description when the caller passes Some (preserves
      the most recent meaningful hint via COALESCE).
    • SqliteStore::list_code_areas(project, in_file, since, limit):
      filters by project, by exact or suffix path, and by since
      timestamp; results are ordered by last_touched_at DESC.
    • SqliteStore::code_area_count().
  • Hook integration: extract_tool_input_file_path() covers three
    shapes seen across Claude Code 1.x / 2.x, Codex, and Gemini
    (tool_input.file_path, top-level file_path,
    tool_input.arguments.file_path). cmd_hook_post calls
    upsert_code_area before the extract counter — independent of
    the throttle, errors swallowed so telemetry never blocks the hook.
  • CLI: new icm code-areas command with --in-file, --project,
    --since (ISO-8601), --limit, --format {table,json}.
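
The hook-side gating and path extraction described above can be sketched roughly as follows. This is a dependency-free illustration, not the code in this PR: the `Value` enum stands in for whatever JSON type the hook actually parses into, `obj`, `is_edit_tool`, and `path_to_record` are hypothetical helper names, and the order in which the three shapes are tried is an assumption.

```rust
use std::collections::HashMap;

/// Minimal stand-in for a parsed JSON value (hypothetical; only what the sketch needs).
#[derive(Debug, Clone)]
enum Value {
    Str(String),
    Obj(HashMap<String, Value>),
}

impl Value {
    /// Look up a key if this value is an object.
    fn get(&self, key: &str) -> Option<&Value> {
        match self {
            Value::Obj(map) => map.get(key),
            _ => None,
        }
    }

    /// Borrow the string if this value is a string.
    fn as_str(&self) -> Option<&str> {
        match self {
            Value::Str(s) => Some(s.as_str()),
            _ => None,
        }
    }
}

/// Convenience constructor for object values (hypothetical helper).
fn obj(pairs: Vec<(&str, Value)>) -> Value {
    Value::Obj(pairs.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}

/// Tool names the PR treats as edits.
const EDIT_TOOLS: [&str; 4] = ["Edit", "Write", "MultiEdit", "NotebookEdit"];

fn is_edit_tool(name: &str) -> bool {
    EDIT_TOOLS.iter().any(|t| *t == name)
}

/// Try the three payload shapes named in the PR; the lookup order is assumed.
fn extract_file_path(payload: &Value) -> Option<&str> {
    let nested = payload
        .get("tool_input")
        .and_then(|t| t.get("file_path"))
        .and_then(Value::as_str);
    let top_level = payload.get("file_path").and_then(Value::as_str);
    let in_arguments = payload
        .get("tool_input")
        .and_then(|t| t.get("arguments"))
        .and_then(|a| a.get("file_path"))
        .and_then(Value::as_str);
    nested.or(top_level).or(in_arguments)
}

/// Return a path only for edit-type tools, so e.g. a Bash payload is ignored.
fn path_to_record<'a>(tool_name: &str, payload: &'a Value) -> Option<&'a str> {
    if is_edit_tool(tool_name) {
        extract_file_path(payload)
    } else {
        None
    }
}
```

A payload that matches none of the shapes, or a non-edit tool name, yields `None`, so the caller can skip the upsert without treating it as an error.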

Why no MCP

Per the issue thread: tying this to the PostToolUse hook gives 100%
coverage across every AI tool already set up with icm init. There is
no new MCP tool to expose and no need for the agent to comply via
system prompt. The existing hook chain already runs after every tool
call.

Notes

  • description is None in the MVP. Once #165 (LLM-summarized wake-up
    briefing) ships, the same provider infrastructure can feed a diff
    summary into description as an opt-in.
  • No new dependencies. Hook + transcript + sqlite plumbing was already
    in place — the patch reuses all of it.
  • Source MCP tool (Context-Engine-AI/Context-Engine) is
    source-available proprietary; ICM ships an Apache-2.0 equivalent
    built from scratch.

Tests

  • 7 new unit tests in icm-store (upsert idempotency, touch_count
    increment, description COALESCE behaviour, project + path-suffix
    filters, since filter, last_touched_at DESC ordering, count).
  • 513 tests pass across the workspace; cargo clippy --workspace --all-targets -- -D warnings is clean.
  • Manual smoke against the release binary:
    • 3 hook payloads (Edit / Write / MultiEdit) → 2 unique paths
      captured.
    • auth.rs re-touched → touch_count = 2.
    • Bash payload → correctly ignored (no code-area row).
    • --in-file filter and --format json both verified.
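
The upsert behaviour these tests exercise can be modelled in memory as below. This is only a sketch of the semantics, not the SQLite implementation: `MemStore` is a hypothetical stand-in for `SqliteStore`, timestamps are reduced to a plain `u64` passed in by the caller, and overwriting `session_id` unconditionally on conflict is an assumption.

```rust
use std::collections::HashMap;

/// In-memory stand-in for one code_areas row (timestamps simplified to u64).
#[derive(Debug, Clone, PartialEq)]
struct CodeArea {
    description: Option<String>,
    session_id: Option<String>,
    tool_name: String,
    touch_count: u32,
    first_touched_at: u64,
    last_touched_at: u64,
}

/// Hypothetical store keyed by (project, file_path), mirroring UNIQUE(project, file_path).
#[derive(Default)]
struct MemStore {
    rows: HashMap<(String, String), CodeArea>,
}

impl MemStore {
    /// Insert a new row or bump the existing one; returns the resulting touch_count.
    fn upsert_code_area(
        &mut self,
        project: &str,
        file_path: &str,
        description: Option<&str>,
        session_id: Option<&str>,
        tool_name: &str,
        now: u64,
    ) -> u32 {
        let key = (project.to_string(), file_path.to_string());
        let row = self
            .rows
            .entry(key)
            .and_modify(|r| {
                // Conflict path: bump the counter and refresh the "last seen" fields.
                r.touch_count += 1;
                r.last_touched_at = now;
                r.session_id = session_id.map(str::to_string); // assumed: plain overwrite
                r.tool_name = tool_name.to_string();
                // COALESCE-like rule: keep the old description unless a new one is given.
                if let Some(d) = description {
                    r.description = Some(d.to_string());
                }
            })
            .or_insert_with(|| CodeArea {
                description: description.map(str::to_string),
                session_id: session_id.map(str::to_string),
                tool_name: tool_name.to_string(),
                touch_count: 1,
                first_touched_at: now,
                last_touched_at: now,
            });
        row.touch_count
    }
}
```

Feeding three edits across two paths into this model leaves two rows, with the re-touched path at `touch_count` 2, which is the shape of the manual smoke result listed above.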

Sample output

$ icm code-areas
Project              Hits   Last touched         Path
--------------------------------------------------------------------------------
icm                  2      2026-05-31 09:26:47  src/auth.rs
icm                  1      2026-05-31 09:26:47  src/login.rs

$ icm code-areas --in-file src/auth.rs --format json
[
  {
    "id": 1,
    "project": "icm",
    "file_path": "src/auth.rs",
    "description": null,
    "session_id": null,
    "tool_name": "Write",
    "touch_count": 2,
    "first_touched_at": "2026-05-31T09:26:47.203993069+00:00",
    "last_touched_at": "2026-05-31T09:26:47.211044445+00:00"
  }
]
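
For readers who want the filter semantics behind `--project`, `--in-file`, `--since`, and `--limit` spelled out, here is a rough in-memory sketch. It is not the SQL used by `list_code_areas`: the `Row` struct is a trimmed, hypothetical version of the real row, timestamps are plain `u64` values, treating `since` as inclusive is an assumption, and the suffix match is a naive string `ends_with`.

```rust
/// Trimmed, hypothetical view of a code_areas row.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    project: String,
    file_path: String,
    touch_count: u32,
    last_touched_at: u64,
}

/// Filter, sort by last_touched_at descending, then cap the result at `limit`.
fn list_code_areas<'a>(
    rows: &'a [Row],
    project: Option<&str>,
    in_file: Option<&str>,
    since: Option<u64>,
    limit: usize,
) -> Vec<&'a Row> {
    let mut out: Vec<&Row> = rows
        .iter()
        // Each filter is skipped when its argument is None.
        .filter(|r| project.map_or(true, |p| r.project == p))
        // ends_with covers both the exact-path and the suffix case.
        .filter(|r| in_file.map_or(true, |f| r.file_path.ends_with(f)))
        // Assumed inclusive lower bound on the last touch time.
        .filter(|r| since.map_or(true, |s| r.last_touched_at >= s))
        .collect();
    out.sort_by(|a, b| b.last_touched_at.cmp(&a.last_touched_at));
    out.truncate(limit);
    out
}
```

Note that a plain `ends_with` would also match `src/oauth.rs` for the suffix `auth.rs`; whether the real query anchors the suffix on a path separator is not stated in this PR.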

Test plan

  • cargo build --workspace
  • cargo clippy --workspace --all-targets -- -D warnings
  • cargo test --workspace (513 passed)
  • Manual hook smoke (3 tool variants + 1 non-edit)
  • icm code-areas end-to-end with --in-file, --format json

Closes #196.

Adds a hook-driven, no-MCP equivalent of Context-Engine's
`record_code_area` tool: every time Claude Code / Codex / Gemini /
Copilot calls `Edit` / `Write` / `MultiEdit` / `NotebookEdit`, the
already-installed PostToolUse hook (`icm hook post`) extracts
`tool_input.file_path` and upserts a row in a new `code_areas` table.

Same `(project, file_path)` increments `touch_count` instead of
duplicating, so the table grows in *file count*, not edit count.

### What's new

- New table `code_areas(id, project, file_path, description,
  session_id, tool_name, touch_count, first_touched_at,
  last_touched_at)` with a `UNIQUE(project, file_path)` constraint
  driving the upsert.
- `SqliteStore::upsert_code_area` — ON CONFLICT bumps `touch_count`,
  refreshes `last_touched_at`/`session_id`/`tool_name`, and only
  overwrites `description` when the caller passes `Some` (preserves
  the most recent meaningful hint).
- `SqliteStore::list_code_areas` — filter by `project`, exact or
  suffix `file_path`, `since` timestamp; ordered by
  `last_touched_at DESC`.
- `extract_tool_input_file_path()` in `main.rs` covers the three
  shapes we've seen across Claude Code 1.x / 2.x, Codex, and Gemini
  (`tool_input.file_path`, top-level `file_path`,
  `tool_input.arguments.file_path`).
- `cmd_hook_post` calls `upsert_code_area` for matching tool names
  **before** the extract counter — independent of the throttle, never
  blocking the hot path, errors swallowed (telemetry must never fail
  the hook).
- New `icm code-areas` CLI command with `--in-file`, `--project`,
  `--since`, `--limit`, `--format {table,json}`.

### Notes

- `description` is `None` in the MVP. Once #165 (LLM-summarized
  briefing) lands, the same provider infrastructure can feed a diff
  summary into `description` opt-in.
- No new dependencies. Hook + transcript + sqlite plumbing was
  already in place; the patch reuses all of it.
- Source MCP tool (`Context-Engine-AI/Context-Engine`) is
  source-available proprietary; ICM ships an Apache-2.0 equivalent
  from scratch.

### Tests

- 7 new unit tests in `icm-store` (upsert idempotency, touch_count
  increment, description preservation/overwrite, project + path-suffix
  filters, `since` filter, ordering, count).
- 513 tests pass across the workspace; `cargo clippy --workspace
  --all-targets -- -D warnings` is clean.
- Manual smoke against the release binary: 3 hook payloads (`Edit`,
  `Write`, `MultiEdit`) → `code-areas` reports 2 unique paths,
  `auth.rs` has touch_count=2, `Bash` payload correctly ignored,
  `--in-file` and `--format json` both work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>