moa

// Mixture of Agents: Make 3 frontier models argue, then synthesize their best insights into one superior answer. ~$0.03/query.

stars: 1,933
forks: 367
updated: March 4, 2026
SKILL.md Frontmatter
name: moa
description: Mixture of Agents: Make 3 frontier models argue, then synthesize their best insights into one superior answer. ~$0.03/query.
author: John Scianna (@Scianna)
version: 1.2.0
requires: OPENROUTER_API_KEY
cost: ~$0.03 per query (paid tier)

Mixture of Agents (MoA)

TL;DR: Make 3 AI models argue with each other. Get an answer better than any single model. Cost: ~$0.03.

Two Usage Modes

A. Standalone CLI (Node.js)

export OPENROUTER_API_KEY="your-key"
node scripts/moa.js "Your complex question"

B. OpenClaw Skill (Agent-orchestrated)

# Install
clawhub install moa

# Or copy to ~/clawd/skills/moa/

The agent can then invoke MoA for complex analysis tasks.


Origin Story

The concept of "Mixture of Agents" comes from research showing LLMs can improve each other's outputs through collaboration. I built this for VC deal analysis—when evaluating startups, you want multiple perspectives, not one model's opinion.

The journey:

  1. Started with 5 free OpenRouter models (Llama, Gemini, Mistral, Qwen, Nemotron)
  2. Rate limits killed me at 2am during peak hours
  3. Switched to 3 paid frontier specialists
  4. Result: ~$0.03/query, answers better than any single model

When to Use

  • Complex analysis — due diligence, market research, technical evaluation
  • Brainstorming — get diverse ideas, synthesize the best
  • Fact-checking — cross-reference across models with different training data
  • High-stakes decisions — when one model's blind spots could hurt you
  • Contrarian thinking — different models have different biases

When NOT to use:

  • Quick Q&A (too slow, 30-90s latency)
  • Real-time chat (not designed for streaming)
  • Simple lookups (overkill)

Model Configuration

Paid Tier (Default) — Recommended

| Role | Model | ~Latency | Strength |
|------|-------|----------|----------|
| Proposer 1 | moonshotai/kimi-k2.5 | 23s | Long context, strong reasoning |
| Proposer 2 | z-ai/glm-5 | 36s | Technical depth, different training corpus |
| Proposer 3 | minimax/minimax-m2.5 | 64s | Nuance catching, thorough analysis |
| Aggregator | moonshotai/kimi-k2.5 | 15s | Fast synthesis |

Why these models?

  • Frontier-class but less congested than GPT-4/Claude
  • Different training data = genuinely different perspectives
  • Chinese models excel at certain reasoning tasks
  • Combined cost is still lower than a single Opus call

Cost breakdown:

3 proposers × ~$0.008 = $0.024
1 aggregator × ~$0.005 = $0.005
─────────────────────────────
Total: ~$0.029/query
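
The arithmetic above can be reproduced in a few lines of JavaScript. This is a minimal sketch: `estimateCost` and its parameter names are hypothetical, and the per-call prices are the approximate figures quoted above, not values read from OpenRouter.

```javascript
// Hypothetical helper: estimates the per-query cost of one MoA run.
// Prices are the approximate per-call figures from the breakdown above.
function estimateCost({ proposers, proposerCost, aggregatorCost }) {
  const total = proposers * proposerCost + aggregatorCost;
  // Round to a tenth of a cent to avoid floating-point noise.
  return Math.round(total * 1000) / 1000;
}

const perQuery = estimateCost({
  proposers: 3,
  proposerCost: 0.008,
  aggregatorCost: 0.005,
});

console.log(`~$${perQuery.toFixed(3)}/query`); // ~$0.029/query
```

Changing `proposers` or the per-call prices shows how the total moves if you swap models.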

Free Tier (Fallback)

5 models: Llama 3.3 70B, Gemini 2.0 Flash, Mistral Small, Nemotron 70B, Qwen 2.5 72B

⚠️ Warning: The free tier hits rate limits during peak hours. Use the --free flag only for testing.


How It Works

        ┌─────────────┐
        │   PROMPT    │
        └──────┬──────┘
               │
    ┌──────────┼──────────┐
    ▼          ▼          ▼
┌────────┐ ┌────────┐ ┌────────┐
│Kimi 2.5│ │ GLM 5  │ │MiniMax │  ← Parallel (they "argue")
│(reason)│ │(depth) │ │(nuance)│
└───┬────┘ └───┬────┘ └───┬────┘
    │          │          │
    └──────────┼──────────┘
               ▼
       ┌──────────────┐
       │  AGGREGATOR  │
       │  (Kimi 2.5)  │
       │              │
       │ • Best of 3  │
       │ • Resolve    │
       │   conflicts  │
       │ • Synthesize │
       └──────┬───────┘
              ▼
       ┌──────────────┐
       │ FINAL ANSWER │
       │ (Synthesized)│
       └──────────────┘
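
The flow in the diagram can be sketched as a parallel fan-out followed by one aggregation call. This is an illustration, not the code in scripts/moa.js: `runMoA`, `callModel`, and the wording of the aggregator prompt are assumptions, and `callModel` is injected so the sketch runs without a network connection or API key.

```javascript
// Model IDs come from the Paid Tier table; everything else here is hypothetical.
const PROPOSERS = [
  'moonshotai/kimi-k2.5',
  'z-ai/glm-5',
  'minimax/minimax-m2.5',
];
const AGGREGATOR = 'moonshotai/kimi-k2.5';

// callModel(model, prompt) -> Promise<string> is supplied by the caller.
async function runMoA(prompt, callModel) {
  // Fan out: all proposers receive the same prompt and run concurrently.
  const proposals = await Promise.all(
    PROPOSERS.map((model) => callModel(model, prompt))
  );

  // Fan in: hand every proposal to the aggregator in a single prompt.
  const numbered = proposals
    .map((text, i) => `Response ${i + 1}:\n${text}`)
    .join('\n\n');
  const aggregatorPrompt =
    `Question: ${prompt}\n\n${numbered}\n\n` +
    'Combine the strongest points, resolve conflicts, and give one answer.';

  return callModel(AGGREGATOR, aggregatorPrompt);
}

// Stub standing in for a real API call, for demonstration only.
const fakeCall = async (model, prompt) => `[${model}] saw ${prompt.length} chars`;

runMoA('Is Rust a good fit for CLI tools?', fakeCall).then(console.log);
```

Because the proposers run concurrently, the wall-clock time of the first stage is set by the slowest proposer rather than by the sum of all three. Note that `Promise.all` rejects as soon as any proposer fails, so this sketch is stricter than the behavior described under Failure Modes.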

API Reference

Function Signature

interface MoAOptions {
  prompt: string;           // Required: The question to analyze
  tier?: 'paid' | 'free';   // Default: 'paid'
}

// handle() resolves to the synthesis text itself, not a wrapper object
type MoAResult = string;    // The final aggregated answer

// Throws on complete failure (all models down, invalid key)
// Returns partial synthesis if 1-2 models fail
async function handle(options: MoAOptions): Promise<MoAResult>

CLI Usage

# Paid tier (default)
node scripts/moa.js "Your complex question"

# Free tier
node scripts/moa.js "Your question" --free
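
For illustration, here is one way a script could turn those command lines into the `prompt` and `tier` options. It is a sketch under assumptions: `parseCliArgs` is a hypothetical helper, and the real scripts/moa.js may parse its arguments differently.

```javascript
// Hypothetical parser: maps CLI arguments to { prompt, tier }.
function parseCliArgs(argv) {
  const args = argv.slice(2); // drop the node binary and script path
  const tier = args.includes('--free') ? 'free' : 'paid';
  // Treat the first non-flag argument as the prompt.
  const prompt = args.find((arg) => !arg.startsWith('--'));
  if (!prompt) {
    throw new Error('Usage: node scripts/moa.js "Your question" [--free]');
  }
  return { prompt, tier };
}

console.log(parseCliArgs(['node', 'scripts/moa.js', 'Your question', '--free']));
// { prompt: 'Your question', tier: 'free' }
```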

Programmatic Usage

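const { handle } = require('./scripts/moa.js');

// Top-level await is not available in CommonJS, so wrap the call in an async function
async function main() {
  const synthesis = await handle({
    prompt: "Analyze the competitive moats in AI code generation",
    tier: 'paid'
  });
  console.log(synthesis);
}

main().catch((err) => {
  console.error(err);
  process.exitCode = 1;
});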

Failure Modes

| Scenario | Behavior |
|----------|----------|
| 1 proposer fails | Synthesis from remaining 2 models |
| 2 proposers fail | Synthesis from 1 model (degraded) |
| All proposers fail | Returns error message |
| Invalid API key | Immediate error with setup instructions |
| Rate limit (free tier) | Returns rate limit error |

The system is designed to degrade gracefully. A 2/3 response is still valuable.
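
This kind of degradation can be approximated with `Promise.allSettled`, which waits for every proposer and reports successes and failures separately. A minimal sketch with hypothetical names (`collectProposals`, `callModel`); it throws when every proposer fails, which is an assumption and may differ from how the real script reports that case.

```javascript
// Hypothetical helper: gathers whatever proposals succeed.
// callModel(model, prompt) -> Promise<string> is supplied by the caller.
async function collectProposals(models, prompt, callModel) {
  const settled = await Promise.allSettled(
    models.map((model) => callModel(model, prompt))
  );

  const proposals = settled
    .filter((result) => result.status === 'fulfilled')
    .map((result) => result.value);

  if (proposals.length === 0) {
    // Assumption: this sketch throws instead of returning an error message.
    throw new Error('All proposers failed');
  }

  return {
    proposals,
    degraded: proposals.length < models.length, // true when any proposer failed
  };
}

// Demonstration with a stub in which one of three models fails.
const flaky = async (model) => {
  if (model === 'b') throw new Error('timeout');
  return `answer from ${model}`;
};

collectProposals(['a', 'b', 'c'], 'question', flaky).then((r) => {
  console.log(r.proposals.length, r.degraded); // 2 true
});
```

The `degraded` flag lets the caller decide whether a synthesis built from fewer proposals is good enough or worth retrying.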


Example Use Cases

VC Due Diligence

node scripts/moa.js "Analyze the competitive landscape for AI code generation. \
Who has defensible moats? Who's likely to be commoditized? Be specific."

Technical Evaluation

node scripts/moa.js "Compare RLHF vs DPO vs RLAIF for LLM alignment. \
Which scales better? What are the failure modes of each?"

Market Research

node scripts/moa.js "What are the emerging use cases for embodied AI in 2026? \
Focus on robotics, drones, and autonomous systems. Include specific companies."

Performance Expectations

| Metric | Paid Tier | Free Tier |
|--------|-----------|-----------|
| P50 Latency | ~45s | ~60s |
| P95 Latency | ~90s | ~120s+ |
| Success Rate | >99% | ~80% (rate limits) |
| Cost/Query | ~$0.03 | $0.00 |

Tips

  1. Be specific — Vague prompts get vague synthesis
  2. Ask for structure — "Give me pros/cons" or "List top 5" helps the aggregator
  3. Use for analysis, not chat — MoA shines for complex reasoning
  4. Batch your queries — 30-90s per query, so plan accordingly
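
Since each query takes 30-90s, a batch is easiest to run unattended. The sketch below runs prompts one at a time and records failures without stopping the batch. `runBatch` is a hypothetical helper, and `handle` is passed in as an argument so the example works with a stub as well as with the function exported by scripts/moa.js.

```javascript
// Hypothetical helper: runs prompts sequentially and keeps going on errors.
async function runBatch(prompts, handle) {
  const results = [];
  for (const prompt of prompts) {
    const startedAt = Date.now();
    try {
      const synthesis = await handle({ prompt, tier: 'paid' });
      results.push({ prompt, ok: true, synthesis, ms: Date.now() - startedAt });
    } catch (err) {
      results.push({ prompt, ok: false, error: err.message, ms: Date.now() - startedAt });
    }
  }
  return results;
}

// Stub in place of the real handle(), for demonstration only.
const stubHandle = async ({ prompt }) => `synthesis for: ${prompt}`;

runBatch(['Question one', 'Question two'], stubHandle).then((results) => {
  for (const r of results) console.log(r.ok ? 'ok  ' : 'fail', r.prompt);
});
```

Running sequentially trades total time for a lower chance of hitting rate limits; whether concurrent queries are safe depends on your OpenRouter account limits.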

Installation

Via ClawHub (Recommended)

clawhub install moa

Manual

  1. Copy skills/moa/ to your ~/clawd/skills/ directory
  2. Set OPENROUTER_API_KEY in your environment
  3. The agent can now invoke MoA for complex queries

Environment Variables

| Variable | Required | Description |
|----------|----------|-------------|
| OPENROUTER_API_KEY | Yes | Your OpenRouter API key |

Get your key at: https://openrouter.ai/keys
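
A script that calls MoA can fail fast when the key is missing. A minimal sketch: `requireApiKey` is a hypothetical helper, and the error wording is illustrative rather than the message the skill itself prints.

```javascript
// Hypothetical helper: returns the key or throws with setup instructions.
function requireApiKey(env = process.env) {
  const key = env.OPENROUTER_API_KEY;
  if (!key || key.trim() === '') {
    throw new Error(
      'OPENROUTER_API_KEY is not set. Get a key at https://openrouter.ai/keys ' +
      'and run: export OPENROUTER_API_KEY="your-key"'
    );
  }
  return key.trim();
}

try {
  requireApiKey();
  console.log('OPENROUTER_API_KEY found');
} catch (err) {
  console.error(err.message);
}
```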


Credits