prompt-compression

// Token-efficient prompt compression techniques for cost optimization

SKILL.md Frontmatter

name: prompt-compression
description: Token-efficient prompt compression techniques for cost optimization
allowed-tools: Read, Write, Edit, Bash, Glob, Grep

Prompt Compression Skill

Capabilities

  • Implement token-efficient prompt compression
  • Design context pruning strategies
  • Configure selective context inclusion
  • Implement LLMLingua-style compression
  • Design summary-based compression
  • Create compression quality metrics

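The first capability, token-efficient compression, can be illustrated with a toy token-pruning pass. This is a sketch only: it drops a fixed stopword list, whereas production compressors such as LLMLingua rank tokens by a small language model's perplexity.

```python
# Toy token-level pruning: drop frequent low-information tokens (stopwords)
# to shrink a prompt. The stopword list here is illustrative, not exhaustive.

STOPWORDS = {"the", "a", "an", "of", "to", "is", "are", "and", "that", "please"}

def prune_tokens(prompt: str) -> str:
    """Return the prompt with stopwords removed, preserving token order."""
    kept = [tok for tok in prompt.split() if tok.lower() not in STOPWORDS]
    return " ".join(kept)

print(prune_tokens("Please summarize the main findings of the report"))
# → "summarize main findings report"
```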
Target Processes

  • cost-optimization-llm
  • agent-performance-optimization

Implementation Details

Compression Techniques

  1. LLMLingua: Token-level compression
  2. Summary Compression: LLM-based summarization
  3. Selective Context: Relevant section extraction
  4. Token Pruning: Remove low-importance tokens
  5. Document Filtering: Pre-retrieval filtering
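Technique 3, Selective Context, can be sketched as a ranking step before the prompt is assembled. The scoring below is a deliberately simple word-overlap heuristic, an assumption for illustration; real implementations typically use embedding similarity or model-based relevance scores.

```python
import re

# Hypothetical "Selective Context" sketch: rank candidate sections by word
# overlap with the query and keep only the top_k most relevant ones.

def select_context(query: str, sections: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k sections with the highest word overlap with the query."""
    query_words = set(re.findall(r"\w+", query.lower()))

    def overlap(section: str) -> int:
        return len(query_words & set(re.findall(r"\w+", section.lower())))

    ranked = sorted(sections, key=overlap, reverse=True)
    return ranked[:top_k]

sections = [
    "Pricing is billed per million tokens.",
    "The office dog is named Biscuit.",
    "Token budgets cap the prompt size per request.",
]
print(select_context("how are tokens billed per request", sections, top_k=2))
```

The irrelevant section about the office dog scores zero overlap and is filtered out, shrinking the prompt without touching the relevant content.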

Configuration Options

  • Compression ratio targets
  • Quality threshold settings
  • Token budget constraints
  • Compression model selection
  • Evaluation metrics
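These options could be grouped into a single validated configuration object. The field names below are illustrative assumptions, not a real library's schema:

```python
from dataclasses import dataclass, field

# Hypothetical configuration covering the options above; defaults are examples.
@dataclass
class CompressionConfig:
    target_ratio: float = 0.5        # keep ~50% of the original tokens
    quality_threshold: float = 0.9   # minimum acceptable quality score
    token_budget: int = 4000         # hard cap on compressed prompt size
    compressor_model: str = "small-lm"  # placeholder model identifier
    metrics: list[str] = field(
        default_factory=lambda: ["rouge-l", "answer-accuracy"]
    )

    def validate(self) -> None:
        if not 0 < self.target_ratio <= 1:
            raise ValueError("target_ratio must be in (0, 1]")
        if self.token_budget <= 0:
            raise ValueError("token_budget must be positive")

cfg = CompressionConfig(target_ratio=0.3, token_budget=2000)
cfg.validate()
```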

Best Practices

  • Monitor the quality-vs-compression tradeoff
  • Test with representative prompts
  • Set appropriate compression ratios
  • Validate compressed prompt quality
  • Track cost savings
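Tracking cost savings can be as simple as comparing token counts before and after compression. The helper below uses a naive whitespace split as a stand-in tokenizer and an assumed example price; in practice you would count tokens with a real tokenizer such as tiktoken and plug in your provider's actual rates.

```python
# Illustrative compression/cost report. Token counting via str.split() is a
# rough proxy; swap in a real tokenizer (e.g. tiktoken) for accurate counts.

def compression_report(original: str, compressed: str,
                       price_per_1k_tokens: float = 0.01) -> dict:
    """Summarize token reduction and estimated cost savings."""
    orig_tokens = len(original.split())
    comp_tokens = len(compressed.split())
    ratio = comp_tokens / orig_tokens if orig_tokens else 1.0
    savings = (orig_tokens - comp_tokens) / 1000 * price_per_1k_tokens
    return {
        "original_tokens": orig_tokens,
        "compressed_tokens": comp_tokens,
        "compression_ratio": round(ratio, 3),
        "estimated_savings": round(savings, 6),
    }

report = compression_report(
    "the quick brown fox jumps over the lazy dog",
    "fox jumps over dog",
)
print(report)
```

Logging such a report per request makes the quality-vs-cost tradeoff auditable over time.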

Dependencies

  • llmlingua (optional)
  • tiktoken
  • transformers