UpGrade is an open-source A/B testing platform for education software. UpGradeAgent is a chatbot that turns natural language inputs into requests to UpGrade's client API endpoints in order to test, simulate, and verify UpGrade's functionality.
This document describes the MVP design of the chatbot, which uses a streamlined 5-node architecture to provide reliable, context-aware conversations about A/B testing operations. The app is built in Python with the LangGraph library and Anthropic's Claude Sonnet 4 model (claude-sonnet-4-20250514).
For the MVP, users interact with the chatbot in the Terminal console (it may later become a Slack bot).
The app prioritizes:
- Accuracy over token cost - Reliable understanding and execution
- Intelligent clarification - Asks for clarification on ambiguous queries (e.g., "What's the status?") rather than making assumptions
- Progressive information gathering - Naturally collects missing information through conversation
- Safety first - Always confirms before potentially destructive actions
- Conversational memory - Retains and uses prior context within sessions
UpGradeAgent uses a streamlined 5-node architecture that handles all conversation flows intelligently:
- Conversation Analyzer - Understands user intent with full context
- Information Gatherer - Collects missing information progressively
- Confirmation Handler - Ensures user approval before actions
- Tool Executor - Executes UpGrade API operations safely
- Response Generator - Creates natural, helpful responses
This architecture handles everything from simple greetings to complex multi-step experiment creation without the complexity of traditional router-based approaches.
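As a rough illustration of how the five nodes could hand off to one another, here is a minimal, dependency-free sketch of a routing decision made after the Conversation Analyzer runs. The node identifiers, the `ConversationState` fields, and the set of actions treated as destructive are all assumptions made for this example; in the real app this kind of function would most likely be attached to the graph as a LangGraph conditional edge.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical node identifiers mirroring the five nodes listed above.
ANALYZER = "conversation_analyzer"
GATHERER = "information_gatherer"
CONFIRMER = "confirmation_handler"
EXECUTOR = "tool_executor"
RESPONDER = "response_generator"

# Assumed set of actions that modify or remove data and so need user approval.
DESTRUCTIVE_ACTIONS = {
    "update_experiment",
    "update_experiment_status",
    "delete_experiment",
}


@dataclass
class ConversationState:
    """Illustrative state passed between nodes (field names are assumptions)."""

    action: Optional[str] = None  # operation the analyzer inferred, if any
    missing_fields: list[str] = field(default_factory=list)
    confirmed: bool = False


def route_after_analysis(state: ConversationState) -> str:
    """Choose the next node once the analyzer has filled in the state."""
    if state.action is None:
        # Greetings, general questions, and clarification requests need no tool.
        return RESPONDER
    if state.missing_fields:
        return GATHERER
    if state.action in DESTRUCTIVE_ACTIONS and not state.confirmed:
        return CONFIRMER
    return EXECUTOR
```

Under these assumptions, a plain greeting goes straight to the Response Generator, while a delete request passes through the Confirmation Handler before the Tool Executor runs.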
- Graph Structure - Streamlined 5-node architecture design and implementation
- Tools - API integration and tool specifications
- Example Chats - Sample bot interactions demonstrating conversation patterns
- Core Terms - Essential UpGrade terminology and concepts
- Assignment Behavior - How assignment rules and consistency work together
- API Reference - Complete API endpoints and request/response examples
- LangGraph Reference - LangGraph framework documentation for implementation
Current Structure (Phase 1 Complete):
- /src/api/ - ✅ Production-ready UpGrade API client and authentication
- /src/models/ - ✅ Comprehensive Pydantic models for all API types
- /src/config.py - ✅ Environment configuration management
- /reference/ - ✅ UpGrade controller implementations for verification
- Verification reports - ✅ Complete backend compliance documentation
Planned Structure (Phase 2):
- /src/agent/ - 🔄 LangGraph nodes and graph definition
- /src/tools/ - 🔄 LangGraph tools built on API client
- Natural conversation - Handles greetings, general questions, and context-aware follow-ups
- Intelligent clarification - Resolves ambiguous queries like "What's the status?"
- Progressive information gathering - Guides users through complex operations step-by-step
- Error recovery - Handles typos, wrong inputs, and helps users get back on track
- Explain Terms and Concepts - Answer questions about UpGrade terminology and A/B testing
- System Health - Check UpGrade service status and version information
- Experiment Management - List, create, update, delete, and start/stop experiments
- Experiment Details - Fetch and explain experiment configurations in detail
- User Simulation - Simulate users visiting decision points and getting assigned conditions
- Testing & Analysis - Test condition balance and verify consistency rule behavior
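To make the "condition balance" check more concrete, here is a small sketch of how simulated assignments might be summarized. It assumes the simulation step has already produced a list of assigned condition codes, one per simulated user; the function names and the tolerance value are hypothetical, not part of UpGradeAgent.

```python
from collections import Counter


def condition_balance(assigned_conditions: list[str]) -> dict[str, float]:
    """Return the share of simulated users that landed in each condition."""
    counts = Counter(assigned_conditions)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {condition: n / total for condition, n in counts.items()}


def is_balanced(
    shares: dict[str, float],
    expected_weights: dict[str, float],
    tolerance: float = 0.05,
) -> bool:
    """Check that every observed share is within `tolerance` of its expected weight."""
    return all(
        abs(shares.get(condition, 0.0) - weight) <= tolerance
        for condition, weight in expected_weights.items()
    )


# Example: 100 simulated users split 48/52 across two conditions.
shares = condition_balance(["control"] * 48 + ["variant"] * 52)
balanced = is_balanced(shares, {"control": 0.5, "variant": 0.5})
```

A fixed tolerance is the simplest possible criterion; a statistical test would be more appropriate for small sample sizes.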
- GET / - Health check and version info
- GET /experiments/contextMetaData - Get available app contexts and their supported values
- GET /experiments/names - Get all experiment names and IDs
- GET /experiments - Get all experiments with optional filtering
- GET /experiments/single/<experiment_id> - Get detailed experiment configuration
- POST /experiments - Create new experiment
- PUT /experiments/<experiment_id> - Update experiment configuration
- POST /experiments/state - Update experiment status (start/stop)
- DELETE /experiments/<experiment_id> - Delete experiment
- POST /v6/init - Initialize users with group memberships
- POST /v6/assign - Get experiment condition assignments for users
- POST /v6/mark - Record decision point visits
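One way to keep these endpoints in a single place is a small registry that maps a tool-friendly name to an HTTP method and path. The sketch below is illustrative only: the registry keys, the `Endpoint` type, and `build_request` are hypothetical names, and the base URL in the usage example is a placeholder. Only the methods and paths come from the list above.

```python
from typing import NamedTuple


class Endpoint(NamedTuple):
    method: str
    path: str  # may contain an {experiment_id} placeholder


# Keys are hypothetical names; methods and paths mirror the endpoint list.
ENDPOINTS = {
    "health_check": Endpoint("GET", "/"),
    "get_context_metadata": Endpoint("GET", "/experiments/contextMetaData"),
    "get_experiment_names": Endpoint("GET", "/experiments/names"),
    "get_experiments": Endpoint("GET", "/experiments"),
    "get_experiment": Endpoint("GET", "/experiments/single/{experiment_id}"),
    "create_experiment": Endpoint("POST", "/experiments"),
    "update_experiment": Endpoint("PUT", "/experiments/{experiment_id}"),
    "update_experiment_state": Endpoint("POST", "/experiments/state"),
    "delete_experiment": Endpoint("DELETE", "/experiments/{experiment_id}"),
    "init_user": Endpoint("POST", "/v6/init"),
    "assign_conditions": Endpoint("POST", "/v6/assign"),
    "mark_decision_point": Endpoint("POST", "/v6/mark"),
}


def build_request(name: str, base_url: str, **path_params: str) -> tuple[str, str]:
    """Return (method, full URL) for a registered endpoint.

    Raises KeyError if the endpoint name is unknown or a path parameter is missing.
    """
    endpoint = ENDPOINTS[name]
    path = endpoint.path.format(**path_params)
    return endpoint.method, base_url.rstrip("/") + path


# Usage with a placeholder base URL:
method, url = build_request(
    "get_experiment", "https://upgrade.example.org/api/", experiment_id="abc123"
)
```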
The following UpGrade features are not supported in the MVP:
- Within-subjects Experiments (Unit of Assignment)
- Factorial Experiments (Design Type)
- Stratified Random Sampling Experiments (Assignment Algorithm)
- TS Configurable (MOOClet) Experiments (Assignment Algorithm)
- Metrics management and logging (/v6/log endpoint)
- Feature flags and preview users
- Public segment creation/management
- Advanced analytics and reporting
The bot will politely reject requests for these features and suggest supported alternatives.
User: What is A/B testing?
Bot: A/B testing is a method of comparing two or more versions of something...
User: What's the status?
Bot: I need clarification on what status you're asking about:
1. UpGrade service health status
2. Status of your experiments
3. Explanation of "status" concept
Which one would you like to know about?
User: Create a new experiment
Bot: I'll help you create a new experiment. What would you like to name it?
User: Homepage Test
Bot: Great! Which app context should this experiment run in?
Available: assign-prog, mathstream, livehint-ai
User: assign-prog
Bot: Perfect! Which decision point should this experiment target?...
See Example Chats for complete conversation examples.
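The experiment-creation exchange above follows a simple pattern: ask for the first required field that is still missing, and stop asking once everything is collected. A minimal sketch of that loop is shown below. The field names and their order are assumptions based on the sample dialogue, not the actual experiment schema, and the real Information Gatherer would phrase its questions with the language model rather than from a fixed table.

```python
from typing import Optional

# Hypothetical required fields, in the order the sample dialogue asks for them.
REQUIRED_FIELDS = {
    "name": "What would you like to name it?",
    "context": "Which app context should this experiment run in?",
    "decision_point": "Which decision point should this experiment target?",
}


def next_question(collected: dict[str, str]) -> Optional[str]:
    """Return the prompt for the first missing field, or None when complete."""
    for field_name, question in REQUIRED_FIELDS.items():
        if not collected.get(field_name):
            return question
    return None


# Replaying the first two turns of the sample dialogue:
collected = {"name": "Homepage Test"}
first_prompt = next_question(collected)   # asks for the app context
collected["context"] = "assign-prog"
second_prompt = next_question(collected)  # asks for the decision point
```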
Status: Production-ready API client with 100% backend compliance
- src/api/upgrade_api.py - Complete async API client with unified auth handling
- src/api/auth.py - Google service account authentication with token refresh
- src/models/api_types.py - Comprehensive Pydantic models for all API types
- src/config.py - Environment configuration management
- main.py - Basic console foundation (echo chat loop)
- 11 API endpoints fully implemented and tested
- 3 authentication patterns (none/bearer/user-id) with enum-based clarity
- Async-first design optimized for LangGraph integration
- Enterprise error handling with retries and structured exceptions
- 100% backend verification against actual UpGrade controllers
- Type safety with Pydantic models and proper optional field handling
- Live API tests passing (5/5 core endpoints)
- Backend compliance verified line-by-line against reference/ controllers
- Documentation with comprehensive verification reports
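The three authentication patterns mentioned above (none/bearer/user-id) could be modeled along the following lines. This is a sketch under stated assumptions, not the code in src/api/: the `AuthType` and `build_headers` names are hypothetical, and the `User-Id` header name is an assumption about how the client endpoints identify a user.

```python
from enum import Enum
from typing import Optional


class AuthType(Enum):
    NONE = "none"        # e.g. a public health check
    BEARER = "bearer"    # service account token
    USER_ID = "user-id"  # per-user client calls


def build_headers(
    auth: AuthType,
    token: Optional[str] = None,
    user_id: Optional[str] = None,
) -> dict[str, str]:
    """Build request headers for the chosen authentication pattern."""
    headers = {"Content-Type": "application/json"}
    if auth is AuthType.BEARER:
        if not token:
            raise ValueError("bearer auth requires a token")
        headers["Authorization"] = f"Bearer {token}"
    elif auth is AuthType.USER_ID:
        if not user_id:
            raise ValueError("user-id auth requires a user id")
        headers["User-Id"] = user_id  # assumed header name
    return headers
```

Using an enum rather than boolean flags keeps each call site explicit about which pattern it expects, and a missing credential fails early instead of producing an unauthenticated request.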
The API foundation is ready for Phase 2: Building LangGraph tools and agent nodes on top of this solid base.
Upcoming tasks:
- Build LangGraph tools layer using the API client
- Implement the 5-node agent architecture
- Add conversation state management
- Connect tools to agent nodes for full functionality
# Setup environment
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
# Configure environment
cp .env.example .env
# Edit .env with your UpGrade API URL and service account file path
# Test API foundation
python test_api_live.py
- Review the Architecture: Start with Graph Structure to understand the 5-node design
- Understand the Domain: Read Core Terms for UpGrade concepts
- See It In Action: Browse Example Chats for interaction patterns
- API Implementation: Review verification reports for technical details