
feat: ANCS-inspired importance scoring, contradiction detection, and benchmarks #18

Merged
SimplyLiz merged 6 commits into develop from feature/ancs-inspired-improvements on Mar 20, 2026

Conversation

@SimplyLiz
Owner

Summary

  • Importance scoring (importanceScoring: true) — preserves high-value messages outside the recency window based on forward-reference density, decision content signals, and a recency bonus
  • Contradiction detection (contradictionDetection: true) — identifies later messages that correct/override earlier ones via IDF-weighted Sørensen-Dice topic overlap + correction signal patterns. Superseded messages are compressed with provenance annotations
  • Language-agnostic similarity — replaced hardcoded English stopword list with smoothed IDF weighting (log(1+N/df)), falling back to unweighted Dice for < 3 messages
  • ANCS benchmark section — new iterativeDesign scenario with architectural corrections, comparing baseline vs importance vs contradiction vs combined across 3 scenarios
  • Docs — updated API reference, compression pipeline docs, and CHANGELOG
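
The three signals named in the importance-scoring bullet can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `Message` shape, the `DECISION_RE` pattern, the identifier pattern, and the 0.5/0.3/0.2 weights are hypothetical and are not taken from this PR's implementation.

```typescript
// Hypothetical sketch; names, patterns, and weights are illustrative assumptions.
interface Message {
  id: string;
  content: string;
}

// Assumed decision-content signals (the PR's real pattern list is not shown here).
const DECISION_RE = /\b(decided|we will|must|going with)\b/i;

// camelCase and snake_case tokens serve as reference targets.
const IDENT_RE = /\b(?:[a-z]+(?:[A-Z][a-z0-9]+)+|[a-z0-9]+(?:_[a-z0-9]+)+)\b/g;

function identifiers(text: string): Set<string> {
  return new Set(text.match(IDENT_RE) ?? []);
}

function scoreImportance(messages: Message[]): number[] {
  const ids = messages.map((m) => identifiers(m.content));
  const n = messages.length;
  return messages.map((m, i) => {
    // Forward-reference density: share of later messages that mention
    // at least one identifier appearing in this message.
    const later = ids.slice(i + 1);
    const referencing = later.filter((s) =>
      [...ids[i]].some((t) => s.has(t)),
    ).length;
    const forwardDensity = later.length > 0 ? referencing / later.length : 0;
    // Decision signal: 1 if the message contains decision-like wording.
    const decision = DECISION_RE.test(m.content) ? 1 : 0;
    // Recency bonus: grows linearly from 0 (oldest) to 1 (newest).
    const recency = n > 1 ? i / (n - 1) : 1;
    return 0.5 * forwardDensity + 0.3 * decision + 0.2 * recency;
  });
}
```

Under these assumed weights, an early message that states a decision and is referenced later can outscore newer filler messages, which is the behavior the bullet describes for messages outside the recency window.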

Benchmark results (ANCS section)

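| Scenario          | Baseline | +Importance | +Contradiction | Combined | Imp. Preserved | Contradicted |
| ----------------- | -------- | ----------- | -------------- | -------- | -------------- | ------------ |
| Deep conversation | 2.37x    | 2.37x       | 2.37x          | 2.37x    | 0              | 0            |
| Agentic coding    | 1.47x    | 1.24x       | 1.47x          | 1.24x    | 4              | 0            |
| Iterative design  | 1.62x    | 1.26x       | 1.62x          | 1.26x    | 6              | 2            |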

All existing benchmarks unchanged (features are opt-in). 540 tests pass.

Test plan

  • npm test — 540 tests pass (28 new: contradiction, importance, ANCS integration)
  • npm run bench — all scenarios PASS round-trip, ANCS section shows expected results
  • npm run lint && npm run format:check — clean

Add two ANCS-inspired features as opt-in options:

- importanceScoring: scores messages by forward-reference density,
  decision/correction content, and recency. High-importance messages
  are preserved outside the recency window. forceConverge truncates
  low-importance messages first.

- contradictionDetection: detects later messages that correct earlier
  ones (via topic overlap + correction signal patterns). Superseded
  messages are compressed with a provenance annotation linking to
  the correction.

Both features are off by default — zero impact on existing behavior.
28 new tests (540 total), zero TS errors.
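
As a rough illustration of the contradiction-detection idea described above (correction signal plus topic overlap, then a provenance annotation), here is a minimal sketch. The `CORRECTION_RE` pattern, the 0.3 threshold, the `[superseded by ...]` annotation format, and all type and function names are hypothetical. For brevity it uses unweighted Dice rather than the IDF-weighted variant the PR describes.

```typescript
// Hypothetical sketch; pattern, threshold, and annotation format are assumptions.
interface Msg {
  id: string;
  content: string;
}

interface Supersession {
  supersededId: string;
  correctedById: string;
  overlap: number;
}

// Assumed correction signals; the PR's actual patterns are not shown.
const CORRECTION_RE = /\b(actually|instead|scratch that|correction|no longer)\b/i;

function topicWords(text: string): Set<string> {
  return new Set(text.toLowerCase().match(/[a-z0-9_]{3,}/g) ?? []);
}

// Unweighted Sørensen-Dice coefficient over two word sets.
function dice(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 || b.size === 0) return 0;
  let shared = 0;
  for (const w of a) if (b.has(w)) shared++;
  return (2 * shared) / (a.size + b.size);
}

function findSupersessions(messages: Msg[], threshold = 0.3): Supersession[] {
  const words = messages.map((m) => topicWords(m.content));
  const out: Supersession[] = [];
  for (let j = 1; j < messages.length; j++) {
    // Only messages carrying a correction signal can supersede earlier ones.
    if (!CORRECTION_RE.test(messages[j].content)) continue;
    let best = -1;
    let bestScore = threshold;
    for (let i = 0; i < j; i++) {
      const s = dice(words[i], words[j]);
      if (s >= bestScore) {
        best = i;
        bestScore = s;
      }
    }
    if (best >= 0) {
      out.push({
        supersededId: messages[best].id,
        correctedById: messages[j].id,
        overlap: bestScore,
      });
    }
  }
  return out;
}

// Provenance annotation linking the superseded message to its correction.
function annotate(original: Msg, s: Supersession): string {
  return `[superseded by ${s.correctedById}] ${original.content}`;
}
```

Requiring both a correction signal and sufficient topic overlap keeps unrelated messages that merely contain "actually" or "instead" from marking earlier content as superseded.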

- CLAUDE.md: add importance and contradiction modules to architecture
- CHANGELOG.md: add [Unreleased] section with both features
- api-reference.md: add 4 new CompressOptions, 2 new CompressResult
  stats, new exports section for importance/contradiction
- compression-pipeline.md: add importance + contradiction to
  classification order, add contradiction output format

- Add iterative design scenario with architectural corrections to
  exercise contradiction detection and importance scoring
- Add ANCS Features benchmark section comparing baseline vs importance
  vs contradiction vs combined, with round-trip verification
- Add AncsResult type, regression comparison, and doc generation

- Replace hardcoded English stopword list with IDF-weighted filtering
  (language-agnostic, adapts to message content)
- Switch from Jaccard to Sørensen-Dice similarity (better sensitivity
  for short-document topic overlap)
- Use smoothed IDF log(1+N/df) with fallback to unweighted Dice for
  < 3 documents
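
The similarity change above can be sketched as follows: tokens are weighted by smoothed IDF, log(1+N/df), and the Dice coefficient is computed over those weights, with a fallback to unweighted Dice when fewer than 3 documents are available. The function names and the tokenizer are assumptions for illustration, not the PR's code.

```typescript
// Hypothetical sketch of IDF-weighted Sørensen-Dice; names are illustrative.
function tokenize(text: string): Set<string> {
  // Unicode-aware word split, so no language-specific stopword list is needed.
  return new Set(text.toLowerCase().match(/[\p{L}\p{N}_]+/gu) ?? []);
}

// Smoothed IDF: log(1 + N / df). Terms present in every document still get
// a small positive weight instead of zero.
function idfWeights(docs: Set<string>[]): Map<string, number> {
  const df = new Map<string, number>();
  for (const doc of docs) {
    for (const t of doc) df.set(t, (df.get(t) ?? 0) + 1);
  }
  const weights = new Map<string, number>();
  for (const [t, d] of df) weights.set(t, Math.log(1 + docs.length / d));
  return weights;
}

// Dice = 2 * w(A ∩ B) / (w(A) + w(B)); every weight is 1 when none are given.
function weightedDice(
  a: Set<string>,
  b: Set<string>,
  weights?: Map<string, number>,
): number {
  const w = (t: string): number => weights?.get(t) ?? 1;
  let shared = 0;
  let total = 0;
  for (const t of a) {
    total += w(t);
    if (b.has(t)) shared += w(t);
  }
  for (const t of b) total += w(t);
  return total === 0 ? 0 : (2 * shared) / total;
}

function similarity(i: number, j: number, docs: Set<string>[]): number {
  // With fewer than 3 documents, df statistics are too sparse to be useful.
  const weights = docs.length < 3 ? undefined : idfWeights(docs);
  return weightedDice(docs[i], docs[j], weights);
}
```

In this sketch, a word shared by every document (such as "the") contributes less to the overlap than a rarer shared term, which is the effect a stopword list would otherwise approximate for English only.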

function extractMessageEntities(content: string): Set<string> {
  const entities = new Set<string>();
  for (const re of [CAMEL_RE, PASCAL_RE, SNAKE_RE, VOWELLESS_RE, FILE_REF_RE]) {
    const matches = content.match(re);

Check failure

Code scanning / CodeQL

Polynomial regular expression used on uncontrolled data High

This regular expression that depends on library input may run slow on strings with many repetitions of '!'.

- Fix unused `_` binding in importance test (use `.values()` iterator)
- Fix stale JSDoc referencing BM25 when formula is smoothed IDF
- Fix API docs referencing Jaccard when similarity is IDF-weighted Dice
- Add camelCase/PascalCase/snake_case extraction to contradiction topic
  words — these identifiers carry the most topic signal
- Document importanceScoring + tokenBudget interaction in API reference
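
The identifier extraction mentioned in the list above might look roughly like this. The three patterns and the `extractIdentifiers` name are assumptions for illustration; the PR's own `CAMEL_RE`, `PASCAL_RE`, and `SNAKE_RE` definitions are not shown in this conversation and may differ.

```typescript
// Hypothetical patterns; the PR's actual regex definitions are not shown.
const CAMEL = /\b[a-z]+(?:[A-Z][a-z0-9]+)+\b/g; // e.g. forceConverge
const PASCAL = /\b(?:[A-Z][a-z0-9]+){2,}\b/g; // e.g. CompressOptions
const SNAKE = /\b[a-z0-9]+(?:_[a-z0-9]+)+\b/g; // e.g. importance_score

// Collect code-style identifiers, which tend to carry more topic signal
// than ordinary prose words.
function extractIdentifiers(content: string): Set<string> {
  const found = new Set<string>();
  for (const re of [CAMEL, PASCAL, SNAKE]) {
    for (const m of content.match(re) ?? []) found.add(m);
  }
  return found;
}
```

Each pattern anchors on word boundaries and uses case changes or underscores as delimiters, so a single capitalized word such as "Rename" is not treated as an identifier.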
SimplyLiz merged commit 11cabc3 into develop on Mar 20, 2026
10 of 11 checks passed
SimplyLiz deleted the feature/ancs-inspired-improvements branch on March 20, 2026 18:11