feat: IC-Former-inspired improvements — base64 classifier, overhead metric #23

Merged
SimplyLiz merged 4 commits into develop from feature/ic-former-inspired-improvements
Mar 22, 2026

@SimplyLiz

Summary

  • Base64 classification fix: Messages containing base64 blobs (40+ chars) were falling through the classifier as T2/T3 and getting summarized, destroying opaque data. Added base64_content pattern to FORCE_T0_PATTERNS and HARD_T0_REASONS — these are now preserved verbatim.
  • Compression overhead ratio metric: computeOverheadRatio() measures compress() wall-clock time vs estimated LLM inference cost. Displayed as OvhdR column in quality bench output.
  • High-entropy quality bench scenario: New scenario with hex dumps, UUID arrays, base64 blobs, and mixed entropy+prose. 4/4 probes pass. Quality baseline saved for v1.4.0.
  • Adjacency scoring was implemented, A/B tested, and removed — zero measurable effect across all scenarios. Commit history preserved for reference.
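The base64 fix described above can be illustrated with a minimal sketch. The actual `FORCE_T0_PATTERNS` entry is not shown in this PR, so the regex, the `classifyMessage` function, and the two-tier result type below are assumptions made for illustration only. The sketch also reproduces the camelCase false positive that the unit tests document as a known limitation.

```typescript
// Hypothetical sketch: not the project's actual classifier or pattern table.
// Matches a run of 40 or more base64-alphabet characters with optional padding.
const BASE64_RUN = /[A-Za-z0-9+\/]{40,}={0,2}/;

type Tier = "T0" | "T2";

interface Classification {
  tier: Tier;
  reason?: string;
}

// Assumed behavior: a base64 hit forces T0 (preserve verbatim). Everything
// else is treated as summarizable, represented here as T2 for simplicity.
function classifyMessage(text: string): Classification {
  if (BASE64_RUN.test(text)) {
    return { tier: "T0", reason: "base64_content" };
  }
  return { tier: "T2" };
}

// 56 characters drawn from the base64 alphabet.
const blob = "QUJDREVGR0hJSktMTU5PUFFSU1RVVldYWVphYmNkZWZnaGlqa2xtbm9w";
const prose = "The deploy finished and the cache was warmed afterwards.";
// A 55-character camelCase identifier: it also satisfies the pattern.
const longIdentifier = "getUserAccountPreferencesFromRemoteConfigurationService";

console.log(classifyMessage("attachment: " + blob).tier); // T0
console.log(classifyMessage(prose).tier); // T2
console.log(classifyMessage(longIdentifier).tier); // T0 (false positive)
```

A length-only pattern cannot tell an encoded payload from a long identifier. Preserving such a message verbatim loses some compression but does not damage data, which is presumably why the limitation was documented rather than fixed.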

Verification

  • 671 tests pass, zero regressions against v1.3.0 baseline
  • npm run lint && npm run format:check clean
  • npm run bench:quality -- --check passes
  • New v1.4.0 quality baseline saved with high-entropy scenario

Test plan

  • npm test — all 671 tests pass
  • npm run bench:quality — high-entropy probes 4/4
  • npm run bench:quality -- --check — no regressions vs v1.3.0
  • npm run lint && npm run format:check — clean
  • A/B tested adjacency scoring, confirmed no effect, removed

…scoring, overhead metric

- Fix base64 classification gap: add FORCE_T0_PATTERNS entry for 40+ char
  base64 strings, add to HARD_T0_REASONS. Hex/UUID probes already pass,
  base64 probe now passes too.
- Sentence adjacency scoring: boost sentences sharing entities with
  neighbors (+2 one side, +3 both) to improve topical coherence in
  summaries. Uses exported extractMessageEntities from importance.ts.
- Compression overhead ratio metric: computeOverheadRatio() measures
  compress() wall-clock time vs estimated LLM inference time. Displayed
  as OvhdR column in quality bench output.
- High-entropy quality bench scenario with hex dump, UUID array, base64
  blob, and mixed entropy+prose messages (4/4 probes pass).
- Unit tests for base64/hex/UUID classification and compress preservation,
  adjacency scoring behavior, and documented known limitations (UUID gap,
  camelCase false-positive on base64 pattern).
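
The overhead ratio metric compares compression time with estimated inference time. The PR does not show the signature of `computeOverheadRatio()` or how the inference estimate is derived, so the sketch below uses a different, hypothetical name and assumes the estimate is simply prompt tokens divided by an assumed model throughput.

```typescript
// Hypothetical sketch: not the project's computeOverheadRatio().
interface OverheadInput {
  compressMs: number; // measured wall-clock time of the compression call
  promptTokens: number; // tokens the model would have to process
  tokensPerSecond: number; // assumed model throughput, a placeholder value
}

// Ratio of compression time to estimated inference time.
// Values well below 1 mean compression is cheap relative to the model call.
function overheadRatioSketch(input: OverheadInput): number {
  const { compressMs, promptTokens, tokensPerSecond } = input;
  if (promptTokens <= 0 || tokensPerSecond <= 0) {
    throw new RangeError("promptTokens and tokensPerSecond must be positive");
  }
  const estimatedInferenceMs = (promptTokens / tokensPerSecond) * 1000;
  return compressMs / estimatedInferenceMs;
}

// 2000 tokens at 1000 tokens/s is an estimated 2000 ms of inference,
// so 5 ms of compression gives a ratio of 5 / 2000.
const ratio = overheadRatioSketch({
  compressMs: 5,
  promptTokens: 2000,
  tokensPerSecond: 1000,
});
console.log(ratio.toFixed(4)); // 0.0025
```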
v1.4.0 baseline with high-entropy content scenario. Zero regressions
vs v1.3.0 across all 13 shared scenarios. New scenario: High-entropy
content at 1.35x ratio, 100% entity retention, 4/4 probes.

Settings bar:
- depth dropdown (gentle/moderate/aggressive/auto)
- relevance toggle + threshold input
- flow, importance, contradiction, coreference, clustering toggles
- budget strategy dropdown (binary-search/tiered, visible when budget on)
- visual divider between v1 and v2 controls

Stats bar:
- quality_score, entity_retention, structural_integrity chips
- messages_relevance_dropped, importance_preserved, contradicted chips
- color-coded: green >=90%, amber >=70%, red <70%
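
The color thresholds above map to a small function. This is a sketch that assumes each chip value is a percentage between 0 and 100; the function and type names are hypothetical.

```typescript
type ChipColor = "green" | "amber" | "red";

// Thresholds taken from the commit message: green >=90%, amber >=70%, red <70%.
function chipColor(percent: number): ChipColor {
  if (percent >= 90) return "green";
  if (percent >= 70) return "amber";
  return "red";
}

console.log(chipColor(95)); // green
console.log(chipColor(70)); // amber
console.log(chipColor(69.9)); // red
```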

Examples:
- "Q&A + corrections" — demonstrates flow + contradiction detection
- "Topic-scattered" — 3 interleaved topics for clustering demo

Help panel: V2 Features section with all new options explained

A/B testing showed adjacency scoring (+2/+3 boost for entity-linked
neighbor sentences) produces identical results across all quality
bench scenarios. The summarizer budget is wide enough that sentence
selection pressure never triggers the tiebreaker. Removing to avoid
dead complexity.
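
For reference, the removed heuristic can be sketched roughly as follows. The original code used `extractMessageEntities` from importance.ts, whose signature is not shown here, so this sketch substitutes a toy extractor that treats capitalized words as entities. All names below are hypothetical.

```typescript
// Toy stand-in for entity extraction: capitalized words count as entities.
function extractEntitiesSketch(sentence: string): string[] {
  return sentence.match(/\b[A-Z][A-Za-z0-9]+\b/g) || [];
}

// True when the two entity lists have at least one entity in common.
function sharesEntity(own: string[], neighbor: string[] | undefined): boolean {
  if (neighbor === undefined) return false;
  return own.some((e) => neighbor.indexOf(e) !== -1);
}

// +2 when a sentence shares an entity with one neighbor, +3 with both, else 0.
function adjacencyBoosts(sentences: string[]): number[] {
  const entities = sentences.map(extractEntitiesSketch);
  return entities.map((own, i) => {
    const left = sharesEntity(own, entities[i - 1]); // undefined at the start
    const right = sharesEntity(own, entities[i + 1]); // undefined at the end
    if (left && right) return 3;
    if (left || right) return 2;
    return 0;
  });
}

const boosts = adjacencyBoosts([
  "Redis handles session storage.",
  "Redis sits behind Envoy now.",
  "Envoy terminates TLS.",
  "lunch was good.",
]);
console.log(boosts); // [2, 3, 2, 0]
```

A boost like this only changes the output when the sentence budget forces a choice between closely scored sentences. According to the A/B result above, that never happened in the bench scenarios.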
@SimplyLiz merged commit cd9165f into develop Mar 22, 2026
4 of 7 checks passed
@SimplyLiz deleted the feature/ic-former-inspired-improvements branch March 22, 2026 17:44