[FEATURE] Prompt Compression & Optimization — Reduce token usage while maintaining quality #471

Description

@gelluisaac

Implement prompt compression techniques to reduce token usage and costs without
sacrificing response quality. Includes LLMLingua-style compression and prompt optimization.

Scope

Build a prompt optimization layer that minimizes token count while preserving meaning.

Files to Touch/Create

  • astroml/llm/optimization/prompts/__init__.py
  • astroml/llm/optimization/prompts/compressor.py — Prompt compression
  • astroml/llm/optimization/prompts/optimizer.py — Prompt optimization
  • astroml/llm/optimization/prompts/selector.py — Prompt selection
  • astroml/llm/optimization/prompts/ab_test.py — A/B testing framework
  • astroml/llm/optimization/prompts/analytics.py — Performance analytics

Compression Techniques

  1. LLMLingua: Token-level compression that uses a small language model to drop low-information tokens
  2. Selective Context: Remove redundant context
  3. Prompt Merging: Combine similar prompts
  4. Dynamic Shortening: Adapt length to query complexity
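
As a rough illustration of technique 2, the sketch below drops sentences that mostly repeat an earlier sentence. It is a simplified stand-in: the published Selective Context method scores content with a language model, whereas this version uses plain lexical (Jaccard) overlap so that it runs without any model. The function name `drop_redundant_sentences` and the `max_overlap` threshold are hypothetical and not part of any existing astroml API.

```python
import re


def _tokens(sentence: str) -> frozenset:
    """Lowercased alphanumeric word set used for overlap comparison."""
    return frozenset(re.findall(r"[a-z0-9]+", sentence.lower()))


def drop_redundant_sentences(context: str, max_overlap: float = 0.8) -> str:
    """Keep a sentence only if it is not a near-duplicate of one already kept.

    Assumption: Jaccard word overlap is a cheap proxy for redundancy.
    A production compressor would likely use model-based scores instead.
    """
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", context) if s.strip()]
    kept = []
    kept_tokens = []
    for sentence in sentences:
        toks = _tokens(sentence)
        if not toks:
            continue  # punctuation-only fragment, nothing to keep
        redundant = any(
            len(toks & prev) / len(toks | prev) >= max_overlap
            for prev in kept_tokens
        )
        if not redundant:
            kept.append(sentence)
            kept_tokens.append(toks)
    return " ".join(kept)


if __name__ == "__main__":
    text = "The star is bright. The star is bright! Galaxies rotate."
    print(drop_redundant_sentences(text))
```

Sentence order is preserved, so the first occurrence of repeated content always wins. Lowering `max_overlap` compresses more aggressively, at the risk of dropping sentences that only look similar.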

Implementation Details

  • Two-stage compression: coarse then fine-grained
  • Quality-aware compression (keep the full prompt when compression lowers quality beyond the allowed threshold)
  • Compression ratio targeting (30-50% token reduction)
  • Per-prompt-type compression profiles
  • A/B testing: compressed vs full prompts
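
The first two points above could be combined roughly as follows: a coarse stage keeps the sentences most relevant to the query, a fine stage removes filler words, and a quality gate falls back to the original prompt when the estimated quality drop is too large. Everything here is an assumption for illustration: `quality_fn` is a hypothetical scorer supplied by the caller, whitespace splitting stands in for a real tokenizer, and the word-overlap ranking is a placeholder for a learned relevance model.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Hypothetical filler list; a real profile would be tuned per prompt type.
FILLER = {"a", "an", "the", "very", "really", "just", "please", "basically"}


@dataclass
class CompressionResult:
    text: str
    ratio: float          # fraction of tokens removed (0.0 when falling back)
    used_fallback: bool   # True when the original prompt was returned


def count_tokens(text: str) -> int:
    # Whitespace split stands in for a model-specific tokenizer.
    return len(text.split())


def coarse_stage(prompt: str, query: str, keep_fraction: float) -> str:
    """Keep the sentences sharing the most words with the query, in order."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", prompt.strip()) if s]
    query_words = set(re.findall(r"\w+", query.lower()))
    scored = [
        (len(query_words & set(re.findall(r"\w+", s.lower()))), i)
        for i, s in enumerate(sentences)
    ]
    n_keep = max(1, round(len(sentences) * keep_fraction))
    # Highest score first; ties go to the earlier sentence.
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:n_keep]
    keep_idx = sorted(i for _, i in top)
    return " ".join(sentences[i] for i in keep_idx)


def fine_stage(text: str) -> str:
    """Drop filler words from the text that survived the coarse stage."""
    return " ".join(w for w in text.split() if w.lower() not in FILLER)


def compress(
    prompt: str,
    query: str,
    quality_fn: Callable[[str], float],
    keep_fraction: float = 0.7,
    max_quality_drop: float = 0.02,
) -> CompressionResult:
    candidate = fine_stage(coarse_stage(prompt, query, keep_fraction))
    drop = quality_fn(prompt) - quality_fn(candidate)
    if drop > max_quality_drop:
        # Quality gate failed: return the uncompressed prompt.
        return CompressionResult(prompt, 0.0, True)
    ratio = 1 - count_tokens(candidate) / max(count_tokens(prompt), 1)
    return CompressionResult(candidate, ratio, False)
```

In practice `quality_fn` might wrap an offline evaluation score or a cheap proxy model. Calling the target LLM inside the gate would likely break the latency budget, so the gate is probably better driven by cached or sampled scores.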

Acceptance Criteria

  • Token reduction >30% with <2% quality loss
  • Compression runs in <100 ms
  • Cost savings >30% on affected prompts
  • Quality monitoring prevents over-compression
  • Automatic fallback to uncompressed on quality drop
  • Per-model compression profiles
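
The numeric criteria above can be turned into an automated check. The sketch below assumes that "quality loss" means relative loss on a task score in (0, 1] and that latency refers to the compression step alone. Both are interpretations, not something the issue specifies. `CompressionReport` and `failed_criteria` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompressionReport:
    tokens_before: int
    tokens_after: int
    quality_before: float   # task score, higher is better (assumed > 0)
    quality_after: float
    latency_ms: float       # time spent in the compression step only


def failed_criteria(report: CompressionReport) -> list:
    """Return a human-readable list of unmet criteria (empty means pass)."""
    if report.tokens_before <= 0 or report.quality_before <= 0:
        raise ValueError("tokens_before and quality_before must be positive")

    failures = []

    reduction = 1 - report.tokens_after / report.tokens_before
    if reduction <= 0.30:
        failures.append(f"token reduction {reduction:.1%} is not above 30%")

    # Relative quality loss; an absolute delta would be another valid reading.
    loss = (report.quality_before - report.quality_after) / report.quality_before
    if loss >= 0.02:
        failures.append(f"quality loss {loss:.1%} is not below 2%")

    if report.latency_ms >= 100:
        failures.append(f"compression took {report.latency_ms:.0f} ms, not below 100 ms")

    return failures


if __name__ == "__main__":
    report = CompressionReport(1000, 650, 0.90, 0.89, 40.0)
    print(failed_criteria(report) or "all criteria met")
```

Returning the list of failures rather than a single boolean makes it easier to log why a compression profile was rejected before falling back to the uncompressed prompt.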

Metrics

  • Compression ratio
  • Quality delta (perplexity, human evaluation)
  • Cost savings
  • Latency change
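
A minimal sketch of how these four metrics might be aggregated from A/B test records, one list per arm. The record fields (`prompt_tokens`, `quality`, `latency_ms`) and the per-1K-token price are assumptions for illustration. Cost is computed on prompt tokens only, so output-token costs are ignored.

```python
from statistics import mean


def summarize(full: list, compressed: list, usd_per_1k_prompt_tokens: float) -> dict:
    """Compare the compressed arm against the full-prompt arm.

    Each record is a dict with hypothetical keys:
    'prompt_tokens', 'quality' (higher is better) and 'latency_ms'.
    """
    full_tokens = mean(r["prompt_tokens"] for r in full)
    comp_tokens = mean(r["prompt_tokens"] for r in compressed)
    return {
        # Fraction of prompt tokens removed on average.
        "compression_ratio": 1 - comp_tokens / full_tokens,
        # Negative values mean the compressed arm scored worse.
        "quality_delta": mean(r["quality"] for r in compressed)
        - mean(r["quality"] for r in full),
        # Average prompt-side saving per request.
        "cost_savings_usd": (full_tokens - comp_tokens) / 1000 * usd_per_1k_prompt_tokens,
        # Negative values mean the compressed arm was faster end to end.
        "latency_change_ms": mean(r["latency_ms"] for r in compressed)
        - mean(r["latency_ms"] for r in full),
    }


if __name__ == "__main__":
    full_arm = [
        {"prompt_tokens": 1000, "quality": 0.90, "latency_ms": 800},
        {"prompt_tokens": 1200, "quality": 0.80, "latency_ms": 1000},
    ]
    compressed_arm = [
        {"prompt_tokens": 600, "quality": 0.88, "latency_ms": 700},
        {"prompt_tokens": 720, "quality": 0.80, "latency_ms": 820},
    ]
    for name, value in summarize(full_arm, compressed_arm, 0.01).items():
        print(f"{name}: {value:.4f}")
```

Because only prompt tokens are priced here, the relative cost saving equals the compression ratio. Perplexity and human evaluation scores would feed into the `quality` field, and a real analysis would also need a significance test before drawing conclusions from small samples.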

Labels

enhancement, llm, optimization, cost
