deep-scout

// Multi-stage deep intelligence pipeline (Search → Filter → Fetch → Synthesize). Turns a query into a structured research report with full source citations.

Stars: 1,933 · Forks: 367 · Updated: March 4, 2026
SKILL.md Frontmatter

  • name: deep-scout
  • description: Multi-stage deep intelligence pipeline (Search → Filter → Fetch → Synthesize). Turns a query into a structured research report with full source citations.
  • version: 0.1.4

deep-scout

Multi-stage deep intelligence pipeline (Search → Filter → Fetch → Synthesize).

🛠️ Installation

1. Ask OpenClaw (Recommended)

Tell OpenClaw: "Install the deep-scout skill." The agent will handle the installation and configuration automatically.

2. Manual Installation (CLI)

If you prefer the terminal, run:

clawhub install deep-scout

🚀 Usage

/deep-scout "Your research question" [--depth 5] [--freshness pw] [--country US] [--style report]

Options

| Flag | Default | Description |
| --- | --- | --- |
| `--depth N` | 5 | Number of URLs to fully fetch (1–10) |
| `--freshness` | pw | pd=past day, pw=past week, pm=past month, py=past year |
| `--country` | US | 2-letter country code for Brave search |
| `--language` | en | 2-letter language code |
| `--search-count` | 8 | Total results to collect before filtering |
| `--min-score` | 4 | Minimum relevance score to keep (0–10) |
| `--style` | report | report \| comparison \| bullets \| timeline |
| `--dimensions` | auto | Comparison dimensions (comma-separated, for `--style comparison`) |
| `--output FILE` | stdout | Write report to file |
| `--no-browser` | off | Disable browser fallback |
| `--no-firecrawl` | off | Disable Firecrawl fallback |

🛠️ Pipeline — Agent Loop Instructions

When this skill is invoked, execute the following four-stage pipeline:


Stage 1: SEARCH

Call web_search with:

query: <user query>
count: <search_count>
country: <country>
search_lang: <language>
freshness: <freshness>

Collect: title, url, snippet for each result.
If fewer than 3 results are returned, retry with freshness: "py" (relaxed).
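The Stage 1 retry behavior can be sketched as follows. This is a minimal illustration, not the skill's implementation: `search_fn` is a stand-in for the agent's `web_search` tool, and `stage1_search` is an illustrative name.

```python
def stage1_search(search_fn, query, search_count=8, country="US",
                  language="en", freshness="pw"):
    """Run the search; relax freshness to 'py' if fewer than 3 results come back.

    `search_fn` stands in for the web_search tool and is expected to return a
    list of dicts with at least "title", "url", and "snippet" keys.
    """
    results = search_fn(query=query, count=search_count, country=country,
                        search_lang=language, freshness=freshness)
    if len(results) < 3 and freshness != "py":
        # Too few hits: retry with the freshness window relaxed to the past year.
        results = search_fn(query=query, count=search_count, country=country,
                            search_lang=language, freshness="py")
    # Collect only the fields the later stages need.
    return [{"title": r["title"], "url": r["url"], "snippet": r["snippet"]}
            for r in results]
```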


Stage 2: FILTER

Load prompts/filter.txt. Replace template vars:

  • {{query}} → the user's query
  • {{freshness}} → freshness param
  • {{min_score}} → min_score param
  • {{results_json}} → JSON array of search results

Call the LLM with this prompt. Parse the returned JSON array.
Keep only results where keep: true. Sort by score descending.
Take top depth URLs as the fetch list.

Deduplication: Max 2 results per root domain (already handled in filter prompt).
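The Stage 2 template substitution and post-processing can be sketched as below. This assumes the filter prompt returns a JSON array of `{url, score, keep}` objects, as described above; the helper names are illustrative and not part of the skill.

```python
def fill_template(template, variables):
    """Naive {{var}} substitution, matching the prompt templates' placeholders."""
    for key, value in variables.items():
        template = template.replace("{{" + key + "}}", str(value))
    return template


def select_fetch_list(filtered, min_score=4, depth=5):
    """Apply the filter decisions: keep, threshold, sort by score, take top N.

    `filtered` is the parsed JSON array returned by the LLM filter call.
    """
    kept = [r for r in filtered
            if r.get("keep") and r.get("score", 0) >= min_score]
    kept.sort(key=lambda r: r["score"], reverse=True)
    return [r["url"] for r in kept[:depth]]
```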


Stage 3: FETCH (Tiered Escalation)

For each URL in the filtered list:

Tier 1 — web_fetch (fast):

Call web_fetch(url)
If content length >= 200 chars → accept, trim to max_chars_per_source

Tier 2 — Firecrawl (deep/JS):

If Tier 1 fails or returns < 200 chars:
  Run: scripts/firecrawl-wrap.sh <url> <max_chars>
  If output != "FIRECRAWL_UNAVAILABLE" and != "FIRECRAWL_EMPTY" → accept

Tier 3 — Browser (last resort):

If Tier 2 fails:
  Call browser(action="open", url=url)
  Call browser(action="snapshot")
  Load prompts/browser-extract.txt, substitute {{query}} and {{max_chars_per_source}}
  Call LLM with snapshot content + extraction prompt
  If output != "FETCH_FAILED:..." → accept

If all tiers fail: Use the original snippet from Stage 1 search results. Mark as [snippet only].

Store: { url: extracted_content } dict.
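The tiered escalation above reduces to a simple fallback loop. In this sketch, `fetchers` stands in for Tiers 1–3 (`web_fetch`, the Firecrawl wrapper, the browser path), each normalized to return extracted text or fail; the function name and signature are illustrative.

```python
def fetch_with_escalation(url, snippet, fetchers, min_len=200, max_chars=8000):
    """Try each fetch tier in order; fall back to the search snippet.

    Each callable in `fetchers` returns extracted text, or None / raises on
    failure (sentinel outputs like FIRECRAWL_UNAVAILABLE should be mapped to
    None by the caller). Returns (content, snippet_only_flag).
    """
    for fetch in fetchers:
        try:
            content = fetch(url)
        except Exception:
            content = None
        if content and len(content) >= min_len:
            # Accept and trim to the per-source character budget.
            return content[:max_chars], False
    # All tiers failed: keep the Stage 1 snippet, marked [snippet only].
    return snippet, True
```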


Stage 4: SYNTHESIZE

Choose prompt template based on --style:

  • report / bullets / timeline → prompts/synthesize-report.txt
  • comparison → prompts/synthesize-comparison.txt

Replace template vars:

  • {{query}} → user query
  • {{today}} → current date (YYYY-MM-DD)
  • {{language}} → language param
  • {{source_count}} → number of successfully fetched sources
  • {{dimensions_or_auto}} → dimensions param (or "auto")
  • {{fetched_content_blocks}} → build as:
    [Source 1] (url1)
    <content>
    ---
    [Source 2] (url2)
    <content>
    

Call LLM with the filled prompt. The output is the final report.
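Assembling the `{{fetched_content_blocks}}` value from the Stage 3 results can be sketched as follows (the helper name is illustrative):

```python
def build_content_blocks(sources):
    """Format fetched sources in the [Source N] (url) layout shown above.

    `sources` is an ordered list of (url, content) pairs from Stage 3.
    """
    blocks = []
    for i, (url, content) in enumerate(sources, start=1):
        blocks.append(f"[Source {i}] ({url})\n{content}")
    # Separate source blocks with a --- divider line.
    return "\n---\n".join(blocks)
```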

If --output FILE is set, write the report to that file. Otherwise, print to the channel.


⚙️ Configuration

Defaults are in config.yaml. Override via CLI flags above.


📂 Project Structure

skills/deep-scout/
├── SKILL.md                     ← This file (agent instructions)
├── config.yaml                  ← Default parameter values
├── prompts/
│   ├── filter.txt               ← Stage 2: relevance scoring prompt
│   ├── synthesize-report.txt    ← Stage 4: report/bullets/timeline synthesis
│   ├── synthesize-comparison.txt← Stage 4: comparison table synthesis
│   └── browser-extract.txt      ← Stage 3: browser snapshot extraction
├── scripts/
│   ├── run.sh                   ← CLI entrypoint (emits pipeline actions)
│   └── firecrawl-wrap.sh        ← Firecrawl CLI wrapper with fallback handling
└── examples/
    └── openclaw-acquisition.md  ← Example output: OpenClaw M&A intelligence

🔧 Error Handling

| Scenario | Handling |
| --- | --- |
| All fetch attempts fail | Use snippet from Stage 1; mark `[snippet only]` |
| Search returns 0 results | Retry with `freshness: py`; error if still 0 |
| Firecrawl not installed | `firecrawl-wrap.sh` outputs `FIRECRAWL_UNAVAILABLE`; skip silently |
| Browser tool unavailable | Skip Tier 3; proceed with available content |
| LLM synthesis exceeds context | Trim sources proportionally, prioritizing high-score sources |
| Rate limit on Brave API | Wait 2s, retry once |
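The rate-limit row can be sketched as a generic retry-once helper; the names here are illustrative, and detecting an actual rate-limit response is left to the caller via `is_rate_limited`.

```python
import time


def call_with_retry(fn, *args, retries=1, delay=2.0, is_rate_limited=None):
    """Call `fn`; on a rate-limit error, wait `delay` seconds and retry once.

    `is_rate_limited` is an optional predicate on the raised exception; when
    omitted, every exception is treated as retryable.
    """
    for attempt in range(retries + 1):
        try:
            return fn(*args)
        except Exception as exc:
            limited = is_rate_limited(exc) if is_rate_limited else True
            if attempt < retries and limited:
                time.sleep(delay)  # back off before the single retry
            else:
                raise
```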

📋 Example Outputs

See examples/openclaw-acquisition.md for a full sample report.


Deep Scout v0.1.0 · OpenClaw Skills · clawhub: deep-scout