Назад към всички

RAG

// Build, optimize, and debug RAG pipelines with chunking strategies, retrieval tuning, evaluation metrics, and production monitoring.

$ git log --oneline --stat
stars:1,933
forks:367
updated:March 4, 2026
SKILL.mdreadonly
SKILL.md Frontmatter
nameRAG
slugrag
descriptionBuild, optimize, and debug RAG pipelines with chunking strategies, retrieval tuning, evaluation metrics, and production monitoring.

When to Use

User wants to implement, improve, or troubleshoot Retrieval-Augmented Generation systems.

Quick Reference

TopicFile
Pipeline components & architecturearchitecture.md
Implementation patterns & codeimplementation.md
Evaluation metrics & debuggingevaluation.md
Security & compliancesecurity.md

Core Capabilities

  1. Architecture design — Select embedding models, vector DBs, and chunking strategies based on requirements
  2. Implementation — Write ingestion pipelines, query handlers, and update logic
  3. Retrieval optimization — Tune top-k, reranking, hybrid search parameters
  4. Evaluation — Build test datasets, measure recall/precision, diagnose failures
  5. Production ops — Monitor quality drift, set up alerts, debug degradation
  6. Security — PII detection, access control, compliance requirements

Decision Checklist

Before recommending architecture, ask:

  • What document types and volume?
  • Latency requirements (real-time chat vs batch)?
  • Update frequency (how often do docs change)?
  • Access control needs (who can see what)?
  • Compliance constraints (GDPR, HIPAA, SOC2)?
  • Budget (managed vs self-hosted, embedding costs)?

Critical Rules

  • Never skip access control — Filter at retrieval time, not after
  • Always overlap chunks — 10-20% prevents context loss at boundaries
  • Evaluate before optimizing — Build eval dataset first, then tune
  • Same embedding model — Query and documents must use identical model
  • Monitor similarity scores — Dropping averages signal drift or issues
  • Plan for deletion — GDPR erasure requires re-embedding capability

Common Failure Patterns

SymptomLikely CauseFix
Wrong docs retrievedQuery too vague, poor chunksQuery expansion, smaller chunks
Relevant doc missedNot indexed, low similarityCheck ingestion, hybrid search
Hallucinated answersContext too shortIncrease top-k, better reranking
Slow responsesLarge chunks, no cachingOptimize chunk size, cache embeddings
Inconsistent resultsNon-deterministic rerankingSet seeds, use stable sorting