RAG
// Build, optimize, and debug RAG pipelines with chunking strategies, retrieval tuning, evaluation metrics, and production monitoring.
$ git log --oneline --stat
stars:1,933
forks:367
updated:March 4, 2026
SKILL.mdreadonly
SKILL.md Frontmatter
nameRAG
slugrag
descriptionBuild, optimize, and debug RAG pipelines with chunking strategies, retrieval tuning, evaluation metrics, and production monitoring.
When to Use
User wants to implement, improve, or troubleshoot Retrieval-Augmented Generation systems.
Quick Reference
| Topic | File |
|---|---|
| Pipeline components & architecture | architecture.md |
| Implementation patterns & code | implementation.md |
| Evaluation metrics & debugging | evaluation.md |
| Security & compliance | security.md |
Core Capabilities
- Architecture design — Select embedding models, vector DBs, and chunking strategies based on requirements
- Implementation — Write ingestion pipelines, query handlers, and update logic
- Retrieval optimization — Tune top-k, reranking, hybrid search parameters
- Evaluation — Build test datasets, measure recall/precision, diagnose failures
- Production ops — Monitor quality drift, set up alerts, debug degradation
- Security — PII detection, access control, compliance requirements
Decision Checklist
Before recommending architecture, ask:
- What document types and volume?
- Latency requirements (real-time chat vs batch)?
- Update frequency (how often do docs change)?
- Access control needs (who can see what)?
- Compliance constraints (GDPR, HIPAA, SOC2)?
- Budget (managed vs self-hosted, embedding costs)?
Critical Rules
- Never skip access control — Filter at retrieval time, not after
- Always overlap chunks — 10-20% prevents context loss at boundaries
- Evaluate before optimizing — Build eval dataset first, then tune
- Same embedding model — Query and documents must use identical model
- Monitor similarity scores — Dropping averages signal drift or issues
- Plan for deletion — GDPR erasure requires re-embedding capability
Common Failure Patterns
| Symptom | Likely Cause | Fix |
|---|---|---|
| Wrong docs retrieved | Query too vague, poor chunks | Query expansion, smaller chunks |
| Relevant doc missed | Not indexed, low similarity | Check ingestion, hybrid search |
| Hallucinated answers | Context too short | Increase top-k, better reranking |
| Slow responses | Large chunks, no caching | Optimize chunk size, cache embeddings |
| Inconsistent results | Non-deterministic reranking | Set seeds, use stable sorting |