This is a two-layer system that breaks down complex queries into parallel subtasks executed by dedicated agents:
- Layer 1 (Orchestrator): Top-level LangChain orchestrator that calls `search_worker_pool` via MCP.
- Layer 2 (SearchAgent): Executes concrete search tasks.
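The two-layer flow can be sketched with plain coroutines. This is an illustrative stand-in only: the real layers communicate over MCP, and `search_worker`, `orchestrate`, and the naive query decomposition below are hypothetical names, not the actual API.

```python
import asyncio

async def search_worker(subtask):
    """Layer 2 stand-in: executes one concrete search task."""
    await asyncio.sleep(0)  # placeholder for the real Firecrawl-backed search
    return f"results for {subtask}"

async def orchestrate(query):
    """Layer 1 stand-in: split the query and fan out to workers in parallel."""
    subtasks = [f"{query} (part {i})" for i in range(3)]  # naive decomposition
    return await asyncio.gather(*(search_worker(t) for t in subtasks))

results = asyncio.run(orchestrate("compare AI frameworks"))
```

The fan-out/fan-in shape is the key point: the orchestrator never searches itself; it only decomposes, delegates, and aggregates.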
- 🚀 High concurrency: Up to 50 parallel search workers
- 🏗️ Two-layer architecture: Orchestrator → SearchWorkers
- 🔧 MCP integration: Process isolation via Model Context Protocol
- 🎯 Specialized execution: Search-only
- 🛡️ Resilience: Graceful fallback and retry logic
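The resilience behavior might look roughly like the retry-with-fallback sketch below. `with_retry`, `flaky_search`, and the backoff delays are hypothetical; the actual logic lives inside the orchestrator and worker pool.

```python
import asyncio

async def with_retry(coro_fn, task, retries=3, base_delay=0.01):
    """Retry coro_fn(task) with exponential backoff, then fall back gracefully."""
    for attempt in range(retries):
        try:
            return await coro_fn(task)
        except ConnectionError:
            await asyncio.sleep(base_delay * 2 ** attempt)
    return f"[fallback] no results for {task}"  # graceful degradation

calls = {"n": 0}

async def flaky_search(task):
    """Simulated transient failure: fails twice, succeeds on the third call."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient search failure")
    return f"results for {task}"

result = asyncio.run(with_retry(flaky_search, "langchain docs"))
```

Returning a tagged fallback value instead of raising keeps one failed subtask from sinking the whole parallel batch.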
```python
from pathlib import Path

from search_agent.configuration import SearchAgentConfig
from search_agent.runtime import create_orchestrator
from search_agent.shared import RunPaths

# Create config
config = SearchAgentConfig()

# Create run paths
paths = RunPaths(
    internal_root_dir=Path("./cache"),
    external_root_dir=Path("./cache"),
    run_suffix="test",
    internal_run_dir=Path("./cache/test"),
    external_run_dir=Path("./cache/test"),
)

# Create the Orchestrator (connects to search_worker_pool)
orchestrator = await create_orchestrator(config=config, paths=paths)

# Run a query
result = await orchestrator.run("Compare the top 5 AI frameworks in a table")

# Or stream updates
async for chunk in orchestrator.stream("A complex multi-step query..."):
    print(chunk)

# Cleanup
await orchestrator.close()
```

```
SearchAgent/
├── README.md                          # README
├── ARCHITECTURE.md                    # Detailed architecture
├── requirements.txt                   # Python dependencies
├── pool_config.yaml                   # Worker pool config
└── src/
    └── search_agent/
        ├── orchestration/             # Orchestration
        │   └── orchestrator.py        # Orchestrator (connects to search_worker_pool)
        ├── coordination/              # Helper utilities
        │   └── _worker_wrapper.py
        ├── execution/                 # Execution layer
        │   └── search_executor.py
        ├── infrastructure/            # Infrastructure
        │   └── firecrawl-mcp-server/  # Firecrawl MCP Server
        ├── configuration/             # Configuration
        ├── runtime/                   # Runtime services
        └── shared/                    # Shared code
```
See ARCHITECTURE.md for full details.
- Orchestration: LangChain (Orchestrator)
- Execution: LangChain (SearchAgent)
- Transport: MCP (Model Context Protocol)
- Parallelism: asyncio.gather (managed inside search_worker_pool)
- External service: Firecrawl (search)
```shell
pip install -r requirements.txt
```

The Firecrawl MCP server is located at `src/search_agent/infrastructure/firecrawl-mcp-server/`.
Manual install:

```shell
cd src/search_agent/infrastructure/firecrawl-mcp-server
rm -rf node_modules package-lock.json
npm install
npm run build
```

Troubleshooting: If you see `Cannot find module '../lib/tsc.js'`, delete `node_modules` and `package-lock.json` and reinstall.
Edit `pool_config.yaml` to adjust the pool size:

```yaml
pools:
  search:
    max_pool_size: 50
```

Set the required API keys as environment variables:

```shell
export OPENAI_API_KEY="your-api-key"
export FIRECRAWL_API_KEY="your-firecrawl-key"
```

Get a Firecrawl key at https://www.firecrawl.dev/app/api-keys.
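One way the `max_pool_size` cap could be enforced is with an `asyncio.Semaphore`; a minimal sketch, assuming the dict below mirrors the parsed `pool_config.yaml` (function names are illustrative, not the pool's real API):

```python
import asyncio

# Parsed form of the pool_config.yaml snippet shown above.
POOL_CONFIG = {"pools": {"search": {"max_pool_size": 50}}}

async def bounded_search(task, sem):
    async with sem:                # at most max_pool_size workers run at once
        await asyncio.sleep(0)     # placeholder for the real search call
        return f"done: {task}"

async def main():
    limit = POOL_CONFIG["pools"]["search"]["max_pool_size"]
    sem = asyncio.Semaphore(limit)
    tasks = [bounded_search(f"task-{i}", sem) for i in range(5)]
    return await asyncio.gather(*tasks)

results = asyncio.run(main())
```

With 5 tasks and a cap of 50 the semaphore never blocks here; lowering `max_pool_size` below the task count would serialize the overflow.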
See notebooks under `examples/`:

- `examples/agents/search_agent_test.ipynb`: SearchAgent usage
- `examples/managers/search_manager_test.ipynb`: Orchestrator end-to-end test
See ARCHITECTURE.md for:
- Full directory structure
- Core components
- Data flow
- Naming conventions
- Usage examples