A self-improving QA agent that automatically tests web applications, identifies bugs, applies fixes, and verifies the fixes – all without human intervention.
QAgent is a multi-agent system that forms a closed loop for automated bug detection and fixing:
┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐
│  TESTER  │───▶│  TRIAGE  │───▶│  FIXER   │───▶│ VERIFIER │
│  Agent   │    │  Agent   │    │  Agent   │    │  Agent   │
└──────────┘    └──────────┘    └──────────┘    └──────────┘
     │                                               │
     │              ┌──────────────┐                 │
     │              │    Redis     │◀────────────────┘
     │              │  (Knowledge  │
     │              │    Base)     │
     │              └──────────────┘
     │                     │
     ▼                     ▼
┌─────────────────────────────────────────────────────────┐
│                W&B Weave (Observability)                │
└─────────────────────────────────────────────────────────┘
- Continuous Testing: Runs E2E tests like a QA engineer, simulating real user flows
- Automatic Bug Fixing: Doesn't just report bugs – it fixes them and redeploys
- Self-Improvement: Learns from past bugs to diagnose and fix faster over time
- Measurable Impact: Track pass rates, time-to-fix, and iterations to prove improvement
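
As a rough illustration of the "Measurable Impact" point, the three metrics named above could be aggregated from per-run records along these lines. This is only a sketch: the `RunRecord` shape and the `summarizeRuns` function are hypothetical names chosen for this example, not part of QAgent's actual code, and the sample numbers are made up.

```typescript
// Hypothetical per-run record; field names are assumptions for illustration.
interface RunRecord {
  testsPassed: number;
  testsTotal: number;
  timeToFixMs: number; // time from failure detection to verified fix
  iterations: number;  // fix attempts needed in this run
}

interface RunSummary {
  passRate: number;
  meanTimeToFixMs: number;
  meanIterations: number;
}

// Aggregate pass rate, mean time-to-fix, and mean iterations across runs.
function summarizeRuns(runs: RunRecord[]): RunSummary {
  if (runs.length === 0) throw new Error("No runs to summarize");
  const passed = runs.reduce((sum, r) => sum + r.testsPassed, 0);
  const total = runs.reduce((sum, r) => sum + r.testsTotal, 0);
  const timeToFix = runs.reduce((sum, r) => sum + r.timeToFixMs, 0);
  const iterations = runs.reduce((sum, r) => sum + r.iterations, 0);
  return {
    passRate: total === 0 ? 0 : passed / total,
    meanTimeToFixMs: timeToFix / runs.length,
    meanIterations: iterations / runs.length,
  };
}

// Made-up sample data: an early run and a later, faster run.
const summary = summarizeRuns([
  { testsPassed: 8, testsTotal: 10, timeToFixMs: 120000, iterations: 3 },
  { testsPassed: 10, testsTotal: 10, timeToFixMs: 60000, iterations: 1 },
]);
console.log(summary);
```

Comparing summaries over successive batches of runs is one way to check whether time-to-fix and iteration counts are actually trending down.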
# Clone the repository
git clone https://github.com/rishabhcli/QAgent.git
cd QAgent
# Install dependencies
pnpm install
# Set up environment
cp .env.example .env.local
# Edit .env.local with your API keys
# Run development server (demo app)
pnpm dev
# Run the QAgent agent
pnpm run agent
# Start the Marimo dashboard
marimo run dashboard/app.py

| Technology | Purpose |
|---|---|
| Browserbase + Stagehand | AI-powered browser automation for E2E testing |
| Vercel | Instant deployment after fixes |
| Redis | Vector knowledge base for learning from past bugs |
| W&B Weave | Tracing and evaluation of agent runs |
| Custom Orchestrator (ADK/A2A-compatible) | Multi-agent workflow coordination (ADK integration planned) |
| Marimo | Interactive analytics dashboard |
| Next.js | Demo application |
| OpenAI | LLM for patch generation |
| File | Purpose |
|---|---|
| CLAUDE.md | Agent configuration, tech stack, workflow rules |
| TASKS.md | Phase-scoped task tracker |
| docs/PRD.md | Product Requirements Document |
| docs/DESIGN.md | System design and data structures |
| docs/ARCHITECTURE.md | Architecture Decision Records |
| prompts/ralph-loop.md | Development workflow prompts |
QAgent/
├── .claude/
│ └── skills/ # Domain-specific knowledge modules
│ ├── browserbase-stagehand/
│ ├── redis-vectorstore/
│ ├── vercel-deployment/
│ ├── wandb-weave/
│ ├── google-adk/
│ ├── marimo-dashboards/
│ └── qagent-agents/
├── agents/ # Agent implementations
│ ├── tester/
│ ├── triage/
│ ├── fixer/
│ ├── verifier/
│ └── orchestrator/
├── app/ # Next.js demo app
├── dashboard/ # Marimo analytics
├── docs/ # Documentation
├── lib/ # Shared libraries
├── prompts/ # Workflow prompts
└── tests/ # Test suites
- Test - Tester Agent runs E2E tests using Browserbase/Stagehand
- Detect - Failures are captured with screenshots, DOM state, logs
- Diagnose - Triage Agent analyzes the failure and queries Redis for similar issues
- Fix - Fixer Agent generates a patch using LLM + past fix patterns
- Deploy - Verifier Agent applies the patch and deploys via Vercel
- Verify - Tests are re-run to confirm the fix works
- Learn - Successful fixes are stored in Redis for future reference
- Repeat - Loop continues until all tests pass
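
The steps above can be sketched as a small control loop. This is a simplified illustration, not the project's orchestrator: the `Agents` interface, the `runLoop` function, and the `maxIterations` cap are all assumptions made for this example, and the agents are modelled as synchronous functions for brevity, whereas real browser automation and deployments would be asynchronous.

```typescript
// Hypothetical types; the real data structures are described in docs/DESIGN.md.
interface Failure { testId: string; message: string; }
interface Diagnosis { failure: Failure; similarFixes: string[]; }
interface Patch { testId: string; diff: string; }

interface Agents {
  runTests(): Failure[];                      // Tester: Test + Detect
  diagnose(failure: Failure): Diagnosis;      // Triage: Diagnose
  generatePatch(d: Diagnosis): Patch;         // Fixer: Fix
  deployAndVerify(patch: Patch): boolean;     // Verifier: Deploy + Verify
  remember(d: Diagnosis, patch: Patch): void; // Learn
}

// Run fix rounds until all tests pass or the (assumed) iteration cap is hit.
function runLoop(agents: Agents, maxIterations = 5) {
  let iterations = 0;
  let failures = agents.runTests();
  while (failures.length > 0 && iterations < maxIterations) {
    iterations++;
    for (const failure of failures) {
      const diagnosis = agents.diagnose(failure);
      const patch = agents.generatePatch(diagnosis);
      // Only verified fixes are stored for future reference.
      if (agents.deployAndVerify(patch)) agents.remember(diagnosis, patch);
    }
    failures = agents.runTests(); // Repeat: re-run the suite
  }
  return { iterations, allPassing: failures.length === 0 };
}

// Usage with in-memory fakes: one broken test that the first patch repairs.
const broken = new Set<string>(["login-flow"]);
const learned: Patch[] = [];
const fakeAgents: Agents = {
  runTests: () =>
    Array.from(broken).map((testId) => ({ testId, message: "element not found" })),
  diagnose: (failure) => ({ failure, similarFixes: [] }),
  generatePatch: (d) => ({ testId: d.failure.testId, diff: "(hypothetical diff)" }),
  deployAndVerify: (patch) => broken.delete(patch.testId),
  remember: (_d, patch) => { learned.push(patch); },
};
const result = runLoop(fakeAgents);
console.log(result, learned.length);
```

The iteration cap is a design choice added here so that a bug the Fixer cannot resolve does not keep the loop running indefinitely.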
- Knowledge Base: Every bug and fix is stored with embeddings for semantic search
- Pattern Learning: Similar bugs are fixed faster using past solutions
- TraceTriage: Agent failures are analyzed to improve prompts and workflows
- RedTeam: Adversarial tests continuously harden the system
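
To make the semantic search behind the knowledge base concrete, here is a minimal in-memory sketch that ranks stored fixes by cosine similarity between embeddings. It is an illustration under stated assumptions: in QAgent the vectors live in Redis and the search would run there, the `FixRecord` shape and `findSimilarFixes` function are hypothetical, and the three-dimensional vectors are toy values rather than real model embeddings.

```typescript
// Hypothetical record shape for a stored bug/fix pair.
interface FixRecord {
  bug: string;
  fix: string;
  embedding: number[];
}

// Cosine similarity: dot product divided by the product of vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vectors match nothing
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the topK stored fixes most similar to the query embedding.
function findSimilarFixes(query: number[], records: FixRecord[], topK = 3): FixRecord[] {
  return records
    .map((record) => ({ record, score: cosineSimilarity(query, record.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map(({ record }) => record);
}

// Toy data: two button-handler bugs and one unrelated routing bug.
const knowledgeBase: FixRecord[] = [
  { bug: "Checkout button unresponsive", fix: "Attach missing onClick handler", embedding: [1, 0, 0] },
  { bug: "API returns 404 on cart route", fix: "Correct the route path", embedding: [0, 1, 0] },
  { bug: "Submit button does nothing", fix: "Restore form onSubmit handler", embedding: [0.9, 0.1, 0] },
];
const matches = findSimilarFixes([1, 0.05, 0], knowledgeBase, 2);
console.log(matches.map((m) => m.bug));
```

A Triage Agent could pass the top matches to the Fixer Agent as context, which is the mechanism by which similar bugs would be fixed faster over time.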
- Start every session by reading CLAUDE.md
- Check current work in TASKS.md
- Follow the Ralph Loop workflow for iterative development
- Load skills from .claude/skills/ as needed
# Install dependencies
pnpm install
# Run demo app
pnpm dev
# Run agent
pnpm run agent
# Run tests
pnpm test
# Run E2E tests
pnpm run test:e2e
# Lint and format
pnpm lint && pnpm format
# Build
pnpm build

See .env.example for required environment variables:
- BROWSERBASE_API_KEY - Browserbase API key
- OPENAI_API_KEY - OpenAI API key
- REDIS_URL - Redis connection string
- VERCEL_TOKEN - Vercel API token
- WANDB_API_KEY - Weights & Biases API key
- GOOGLE_CLOUD_PROJECT - Google Cloud project (reserved for ADK/A2A integration)
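
A startup check for these variables might look like the sketch below. The `findMissingEnvVars` helper is a hypothetical name for this example, not an existing QAgent function, and treating `GOOGLE_CLOUD_PROJECT` as optional is an assumption based on its "reserved" note above.

```typescript
// Variables assumed to be required; GOOGLE_CLOUD_PROJECT is left out because
// it is described as reserved for a planned integration (an assumption).
const REQUIRED_ENV_VARS = [
  "BROWSERBASE_API_KEY",
  "OPENAI_API_KEY",
  "REDIS_URL",
  "VERCEL_TOKEN",
  "WANDB_API_KEY",
] as const;

// Return the names of required variables that are unset or blank.
function findMissingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV_VARS.filter((name) => (env[name] ?? "").trim() === "");
}

// Usage with a sample object; values here are placeholders, not real keys.
const missing = findMissingEnvVars({
  OPENAI_API_KEY: "placeholder",
  REDIS_URL: "redis://localhost:6379",
  WANDB_API_KEY: "   ",
});
console.log(missing);
```

In a Node.js entry point the same function could be called with `process.env`, exiting early with a clear message if the returned list is not empty.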
See the Quick Start section above for setup instructions. Once running, connect a GitHub repository through the dashboard and start your first QAgent run.
- QAgent Paper - Agentic patching framework
- Stagehand - AI browser automation
- Browserbase - Cloud browsers
- W&B Weave - LLM observability
- Google ADK - Planned orchestration framework
- Marimo - Reactive notebooks