research-agent-optimization

// Optimize the research agent for rate limit handling, API call efficiency, web search integration fixes, and improved streaming UX with granular progress updates and source attribution.

updated:March 4, 2026

SKILL.md Frontmatter

name: research-agent-optimization

description: Optimize the research agent for rate limit handling, API call efficiency, web search integration fixes, and improved streaming UX with granular progress updates and source attribution.

Research Agent Optimization

Scope

Project root: /home/bender/classwork/Thesis
Backend: backend/news_research_agent.py, backend/app/api/routes/research.py, backend/app/services/news_research.py
Frontend: frontend/app/search/page.tsx, frontend/lib/api.ts
Configuration: backend/app/core/config.py

Problem Statement

Rate Limiting: Gemini API hits 429 quota exceeded errors during research and article analysis
Web Search: DuckDuckGo tool integration has naming issues (not properly initialized)
Unclear Progress: Research streaming shows generic "Still working..." instead of specific tool calls
JSON in Response: Results show raw JSON blocks instead of formatted source cards
Redundant API Calls: Multiple internal search calls without caching/deduplication

Required Outcomes

Graceful rate limit handling with exponential backoff and quota monitoring
Working web search tool with proper DuckDuckGo initialization
Verbose streaming events showing real tool execution (web_search, news_search, search_internal_news)
Research results rendered with inline source cards (not JSON blocks)
Optimized API calls: batch searches, cache semantic results, reuse internal knowledge base
Clear error messages when quota is exceeded

Workflow

1. API Call Optimization

Implement request batching in search_internal_news tool
Add caching layer for semantic search results (avoid duplicate queries within 5min window)
Combine web_search + news_search into single result set
Track API call counts per session and warn before quota exhaustion
Add retry logic with exponential backoff (delays of 1s, 2s, 4s, capped at 8s)

Files:

backend/news_research_agent.py - tools and caching
backend/app/services/news_research.py - request batching helpers
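
The caching step above can be sketched as a small in-memory cache with a 5-minute expiry. This is a minimal sketch under stated assumptions, not the project's actual code: `TTLCache`, `normalize_query`, and `cached_search` are hypothetical names, and `search_fn` stands in for whatever callable performs the real semantic search.

```python
import time
from typing import Any, Callable, Dict, List, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds (hypothetical helper)."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._store: Dict[str, Tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            self.hits += 1  # fresh entry: skip the underlying call
            return entry[1]
        self.misses += 1
        value = compute()
        self._store[key] = (now, value)
        return value


def normalize_query(query: str) -> str:
    """Collapse case and whitespace so trivially different queries share a key."""
    return " ".join(query.lower().split())


def cached_search(cache: TTLCache, query: str,
                  search_fn: Callable[[str], List[dict]]) -> List[dict]:
    """Run search_fn(query) unless a fresh result is already cached."""
    return cache.get_or_compute(normalize_query(query), lambda: search_fn(query))
```

The `hits` and `misses` counters give a simple way to log cache hits for repeated queries and to approximate a per-session API call count.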

2. Rate Limit & Quota Handling

Add try/except wrapper around Gemini calls
Detect 429 errors and return user-friendly message ("API Rate Limit: ...please wait a moment...")
Add optional --skip-gemini-analysis mode for article analysis when quota is low
Log quota usage and remaining tokens
Set model to gemini-2.0-flash (faster, lower token cost) instead of gemini-2.0-flash-exp

Files:

backend/app/core/config.py - error handling wrapper, model selection
backend/app/api/routes/research.py - HTTP error responses
backend/news_research_agent.py - LLM call error handling
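
A retry wrapper with exponential backoff might look like the sketch below. It assumes the Gemini client surfaces a 429 as an exception; `RateLimitError` is a hypothetical stand-in for that exception, and `retry_on_rate_limit` and `backoff_delays` are illustrative names rather than existing project helpers.

```python
import functools
import time
from typing import Callable, List, Tuple, Type


class RateLimitError(Exception):
    """Hypothetical stand-in for whatever the Gemini client raises on HTTP 429."""


RATE_LIMIT_MESSAGE = (
    "API Rate Limit: The AI service has reached its rate limit. "
    "Please wait a moment and try again."
)


def backoff_delays(max_retries: int = 4, base: float = 1.0, cap: float = 8.0) -> List[float]:
    """Return the delay schedule; the defaults give 1s, 2s, 4s, 8s."""
    return [min(base * (2 ** i), cap) for i in range(max_retries)]


def retry_on_rate_limit(
    retry_on: Tuple[Type[BaseException], ...] = (RateLimitError,),
    max_retries: int = 4,
    sleep: Callable[[float], None] = time.sleep,
):
    """Retry the wrapped call on rate-limit errors, sleeping between attempts."""

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for delay in backoff_delays(max_retries):
                try:
                    return fn(*args, **kwargs)
                except retry_on:
                    sleep(delay)
            # Final attempt: let the error propagate so the caller can
            # translate it into RATE_LIMIT_MESSAGE for the user.
            return fn(*args, **kwargs)

        return wrapper

    return decorator
```

Passing `sleep` as a parameter keeps the decorator testable without real waiting; the route handler would catch the final exception and return `RATE_LIMIT_MESSAGE` instead of a stack trace.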

3. Web Search Tool Fix

Verify the DuckDuckGo import: from duckduckgo_search import DDGS (the class name is DDGS, not ddgs or DuckDuckGo)
Ensure web_search and news_search tools are properly bound to LLM
Add fallback to internal search if web search fails
Log tool execution with query and result count

Files:

backend/news_research_agent.py - tool definitions and error handling
Use exa-code to verify current DuckDuckGo API patterns
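
The fallback behavior could be sketched as follows. `search_with_fallback` and `ddg_text_search` are hypothetical names; the sketch also treats an empty web result as a failure, which is an assumption beyond what the list above states. The import line follows the one named above, and it is worth checking against the installed version since the library's package name may differ between releases.

```python
import logging
from typing import Callable, Dict, List

logger = logging.getLogger("research_agent.tools")

SearchFn = Callable[[str], List[Dict[str, str]]]


def ddg_text_search(query: str, max_results: int = 5) -> List[Dict[str, str]]:
    """Query DuckDuckGo; the import is lazy so the module loads without the package."""
    from duckduckgo_search import DDGS  # class name is upper-case DDGS

    return list(DDGS().text(query, max_results=max_results))


def search_with_fallback(query: str, web_fn: SearchFn,
                         internal_fn: SearchFn) -> Dict[str, object]:
    """Try the web tool first; fall back to internal search on error or no results."""
    source = "web_search"
    try:
        results = web_fn(query)
    except Exception as exc:  # a tool failure should not abort the research run
        logger.warning("web_search failed for %r: %s", query, exc)
        results = []
    if not results:
        # Assumption: an empty result set is handled like a failure.
        results = internal_fn(query)
        source = "search_internal_news"
    logger.info("tool=%s query=%r results=%d", source, query, len(results))
    return {"source": source, "results": results}
```

In the agent, `web_fn` would be `ddg_text_search` and `internal_fn` the existing internal search tool; injecting both makes the fallback path testable without network access.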

4. Streaming Progress Clarity

Expand SSE event types: tool_start includes tool name + query parameters
Map tool events to user-friendly messages:
- web_search("climate change") → "Searching web for: climate change..."
- news_search(keywords="COP30") → "Searching news for: COP30..."
- search_internal_news(query) → "Searching internal knowledge base..."
- fetch_article_content(url) → "Reading article: [title/domain]..."
Add timestamps and tool execution duration
Emit status updates every 3-5 seconds if no tool activity

Files:

backend/news_research_agent.py - streaming generator
backend/app/api/routes/research.py - SSE formatting
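
The tool-to-message mapping and SSE framing above can be sketched like this. The function names and the payload fields (`tool`, `args`, `message`, `timestamp`) are assumptions for illustration; only the tool names and message wording come from the list above. Since the article title is not known before fetching, the sketch shows the domain.

```python
import json
import time
from typing import Any, Dict
from urllib.parse import urlparse


def describe_tool_call(tool: str, args: Dict[str, Any]) -> str:
    """Turn a tool name and its arguments into a user-facing status line."""
    if tool == "web_search":
        return f"Searching web for: {args.get('query', '')}..."
    if tool == "news_search":
        return f"Searching news for: {args.get('keywords', '')}..."
    if tool == "search_internal_news":
        return "Searching internal knowledge base..."
    if tool == "fetch_article_content":
        domain = urlparse(args.get("url", "")).netloc or "article"
        return f"Reading article: {domain}..."
    return f"Running {tool}..."  # unknown tools still get a specific message


def format_sse(event: str, data: Dict[str, Any]) -> str:
    """Serialize one Server-Sent Event frame: event line, JSON data line, blank line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def tool_start_event(tool: str, args: Dict[str, Any]) -> str:
    """Build a tool_start frame carrying the tool name, arguments, and timestamp."""
    payload = {
        "tool": tool,
        "args": args,
        "message": describe_tool_call(tool, args),
        "timestamp": time.time(),
    }
    return format_sse("tool_start", payload)
```

A matching result event could carry the same timestamp field, so the frontend can compute tool execution duration from the pair.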

5. Frontend Result Rendering

Remove JSON blocks from response text
Render referenced articles in a "Sources" section below the answer
Use article cards: title, source, date, image thumbnail
Make cards clickable to open article detail modal
Group sources by retrieval method (semantic, web search, internal)

Files:

frontend/app/search/page.tsx - message rendering and sources grid
frontend/lib/api.ts - response parsing
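
The parsing logic behind these steps is sketched below in Python to match the backend; the listed frontend files are TypeScript, so this would either run server-side before the answer is streamed or be ported to frontend/lib/api.ts. The payload shape (a fenced JSON block holding a `sources` list whose items have a `url` field) is an assumption, and both function names are hypothetical.

```python
import json
import re
from typing import Dict, List, Tuple

FENCE = "`" * 3  # built dynamically so this example can sit inside a fenced block
JSON_BLOCK = re.compile(FENCE + r"json\s*(.*?)" + FENCE, re.DOTALL)


def dedupe_by_url(sources: List[Dict]) -> List[Dict]:
    """Keep the first occurrence of each URL, preserving order."""
    seen = set()
    unique = []
    for src in sources:
        url = src.get("url")
        if url is not None and url in seen:
            continue
        seen.add(url)
        unique.append(src)
    return unique


def split_answer_and_sources(text: str) -> Tuple[str, List[Dict]]:
    """Strip fenced JSON blocks from the answer and collect their source entries."""
    sources: List[Dict] = []
    for raw in JSON_BLOCK.findall(text):
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed block: still removed from the text below
        if isinstance(parsed, dict):
            items = parsed.get("sources", [])  # assumed payload key
        elif isinstance(parsed, list):
            items = parsed
        else:
            items = []
        sources.extend(item for item in items if isinstance(item, dict))
    answer = JSON_BLOCK.sub("", text).strip()
    return answer, dedupe_by_url(sources)
```

The returned answer is plain text for the message body, and the de-duplicated list feeds the "Sources" cards.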

6. Error Handling & User Feedback

Detect and handle:
- 429 quota exceeded → "API Rate Limit: The AI service has reached its rate limit. Please wait a moment and try again."
- Connection timeout → "Request Timeout: The research took too long. Try a simpler query."
- Tool execution failure → "Tool [name] failed: [reason]. Continuing with alternative search..."
Add retry prompt on error (not automatic, user chooses)
Log all errors with request ID for debugging

Files:

backend/app/api/routes/research.py - error formatting
frontend/app/search/page.tsx - error UI and retry logic
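
One way to centralize the error mapping above is a single translation function, sketched here. `RateLimitError` and `ToolExecutionError` are hypothetical exception classes, and the returned fields (`code`, `message`, `request_id`, `can_retry`) are an assumed response shape rather than the existing API contract.

```python
import logging
import uuid
from typing import Dict, Optional

logger = logging.getLogger("research_agent.errors")


class RateLimitError(Exception):
    """Hypothetical: raised when the upstream LLM returns HTTP 429."""


class ToolExecutionError(Exception):
    """Hypothetical: raised when a named tool fails."""

    def __init__(self, tool: str, reason: str):
        super().__init__(f"{tool}: {reason}")
        self.tool = tool
        self.reason = reason


def to_user_error(exc: Exception, request_id: Optional[str] = None) -> Dict[str, object]:
    """Map an exception to a user-facing payload and log it with a request ID."""
    request_id = request_id or uuid.uuid4().hex
    if isinstance(exc, RateLimitError):
        code = "rate_limit"
        message = ("API Rate Limit: The AI service has reached its rate limit. "
                   "Please wait a moment and try again.")
    elif isinstance(exc, TimeoutError):
        code = "timeout"
        message = "Request Timeout: The research took too long. Try a simpler query."
    elif isinstance(exc, ToolExecutionError):
        code = "tool_failure"
        message = (f"Tool {exc.tool} failed: {exc.reason}. "
                   "Continuing with alternative search...")
    else:
        code = "internal"
        message = "Something went wrong. Please try again."  # no stack trace exposed
    logger.error("request_id=%s code=%s error=%r", request_id, code, exc)
    # can_retry lets the UI show a retry prompt; the retry itself stays manual.
    return {"code": code, "message": message, "request_id": request_id, "can_retry": True}
```

Keeping the full exception in the log and only the mapped message in the payload means the user never sees a stack trace, while the request ID still ties the two together for debugging.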

Checks

API Optimization

Verify semantic search results are cached (no duplicate calls)
Check web_search and news_search return results (not empty)
Confirm tool execution logs show cache hits for repeated queries

Rate Limit Handling

Trigger 429 error and verify graceful fallback message displays
Confirm no stack traces shown to user
Check logs show quota status and retry timing

Web Search

Query "climate change" and verify web_search returns 5+ results
Confirm DuckDuckGo DDGS class is properly instantiated
Check news_search returns recent news articles

Streaming Clarity

Monitor SSE events for tool_start with query details
Verify timestamps increment correctly
Confirm the generic "Still working..." message appears only after 30s of inactivity

Frontend Rendering

Verify research answer is plain text (no JSON)
Check "Sources" section appears with article cards
Confirm card click opens article detail modal
Verify no duplicate sources (de-duplication working)

Error Scenarios

Submit an invalid query and verify the agent does not crash
Test with network disconnect and check timeout message
Simulate quota exceeded (429) and verify user sees rate limit message

Implementation Checklist

Add retry decorator with exponential backoff to Gemini client
Implement request cache in search_internal_news with 5min TTL
Fix DuckDuckGo tool initialization (verify DDGS import)
Update research_stream() to emit granular tool start/result events
Map tool events to human-readable status messages in API endpoint
Remove JSON block from final answer text
Add "Sources" section with article cards to frontend
Update error handling for 429 quota exceeded
Add streaming status animation to UI
Write tests for quota handling and web search integration