Назад към всички

search-cluster

// Aggregated search aggregator using Google CSE, GNews RSS, Wikipedia, Reddit, and Scrapling.

$ git log --oneline --stat
stars:1,933
forks:367
updated:March 4, 2026
SKILL.mdreadonly
SKILL.md Frontmatter
namesearch-cluster
descriptionAggregated search aggregator using Google CSE, GNews RSS, Wikipedia, Reddit, and Scrapling.

Search Cluster (Industrial Standard v3.1)

A multi-provider search aggregator designed for high-availability and security.

Installation

The scrapling provider requires a dedicated virtual environment.

  1. Create a venv: python3 -m venv venv/scrapling
  2. Install scrapling: venv/scrapling/bin/pip install scrapling
  3. Provide the path to the venv binary in SCRAPLING_PYTHON_PATH.

Security Posture

  • Subprocess Isolation: Query inputs are passed as arguments to stealth_fetch.py.
  • Strict TLS: Mandatory SSL verification on all providers.
  • Sanitization: Integrated native internal scrubber (Path Neutral).

Requirements and Environment

Declare these variables in your environment or vault:

VariableRequirementDescription
GOOGLE_API_KEYOptionalAPI Key for Google Custom Search.
GOOGLE_CSE_IDOptionalSearch Engine ID for Google CSE.
SCRAPLING_PYTHON_PATHOptionalPath to the scrapling venv python binary.
REDIS_HOSTOptionalHost for result caching.
REDIS_PORTOptionalPort for result caching (Default: 6379).
SEARCH_USER_AGENTOptionalCustom User-Agent string.

Providers

  • google: Official Google Custom Search.
  • wiki: Wikipedia OpenSearch API.
  • reddit: Reddit JSON search API.
  • gnews: Google News RSS aggregator.
  • scrapling: Headless stealth scraping (via DuckDuckGo).

Included Scripts

  • scripts/search-cluster.py: Main entry point.
  • scripts/stealth_fetch.py: Scrapling fetcher (REQUIRED for scrapling provider).

Workflow

  1. Execute: scripts/search-cluster.py all "<query>"
  2. Output is structured JSON with source, title, link, and sanitized snippet.