Назад към всички

mcp-local-rag

// Provides score interpretation (< 0.3 good, > 0.5 skip), query optimization, and source naming for query_documents, ingest_file, ingest_data tools. Use this skill when working with RAG, searching documents, ingesting files, saving web content, or handling PDF, HTML, DOCX, TXT, Markdown.

$ git log --oneline --stat
stars:149
forks:28
updated:February 24, 2026
SKILL.mdreadonly
SKILL.md Frontmatter
namemcp-local-rag
descriptionIngest, search, list, update, or delete content in a local mcp-local-rag index when the user is working with local documents or pasted/fetched HTML, Markdown, or text. Use this skill to choose the right MCP tool or `npx mcp-local-rag` CLI command, formulate effective queries, interpret search scores, and manage source metadata.

MCP Local RAG Skills

Tools

MCP ToolCLI EquivalentUse When
ingest_filenpx mcp-local-rag ingest <path>Local files (PDF, DOCX, TXT, MD). CLI for bulk/directory.
ingest_dataRaw content (HTML, text) with source URL
query_documentsnpx mcp-local-rag query <text>Semantic + keyword hybrid search
delete_filenpx mcp-local-rag delete <path>Remove ingested content
list_filesnpx mcp-local-rag listFile ingestion status
statusnpx mcp-local-rag statusDatabase stats

Search: Core Rules

Hybrid search combines vector (semantic) and keyword (BM25).

Score Interpretation

Lower = better match. Use this to filter noise.

ScoreAction
< 0.3Use directly
0.3-0.5Include if mentions same concept/entity
0.5-0.7Include only if directly relevant to the question
> 0.7Skip unless no better results

Limit Selection

IntentLimit
Specific answer (function, error)5
General understanding10
Comprehensive survey20

Query Formulation

SituationWhy TransformAction
Specific term mentionedKeyword search needs exact matchKEEP term
Vague queryVector search needs semantic signalADD context
Error stack or code blockLong text dilutes relevanceEXTRACT core keywords
Multiple distinct topicsSingle query conflates resultsSPLIT queries
Few/poor resultsTerm mismatchEXPAND (see below)

Query Expansion

When results are few or all score > 0.5, expand query terms:

  • Keep original term first, add 2-4 variants
  • Types: synonyms, abbreviations, related terms, word forms
  • Example: "config""config configuration settings configure"

Avoid over-expansion (causes topic drift).

Result Selection

When to include vs skip—based on answer quality, not just score.

INCLUDE if:

  • Directly answers the question
  • Provides necessary context
  • Score < 0.5

SKIP if:

  • Same keyword, unrelated context
  • Score > 0.7
  • Mentions term without explanation

fileTitle

Each result includes fileTitle (document title extracted from content). Null when extraction fails.

UseHow
Disambiguate chunksUse fileTitle to identify which document the chunk belongs to
Group related chunksSame fileTitle = same document context
Deprioritize mismatchesfileTitle unrelated to query AND score > 0.5 → rank lower

Ingestion

ingest_file

ingest_file({ filePath: "/absolute/path/to/document.pdf" })

ingest_data

ingest_data({
  content: "<html>...</html>",
  metadata: { source: "https://example.com/page", format: "html" }
})

Format selection — match the data you have:

  • HTML string → format: "html"
  • Markdown string → format: "markdown"
  • Other → format: "text"

Source format:

  • Web page → Use URL: https://example.com/page
  • Other content → Use scheme: {type}://{date} or {type}://{date}/{detail} where {type} is a short identifier for the content origin (e.g., clipboard, chat, note, meeting)

HTML source options:

  • Static page → HTTP fetch
  • SPA/JS-rendered → Browser/web tool with DOM rendering
  • Auth required → Manual paste

If HTTP fetch returns empty or minimal content, retry with a browser/web tool.

Source URLs are normalized: query strings and fragments are stripped. See html-ingestion.md for cases where this matters.

Re-ingest same source to update. Use same source in delete_file to remove.

CLI commands

CLI subcommands mirror MCP tools. Useful for bulk operations, scripting, and environments without MCP.

  • query, list, status, delete output JSON to stdout
  • ingest outputs progress to stderr
  • Use --help on any command for options
  • See cli-reference.md for options and config matching

References

For edge cases and examples: