This project implements an AI agent for Databricks account teams to create use case plans for customer migrations and greenfield scenarios. The agent uses DSPy (Declarative Self-improving Python) framework with MLflow's built-in conversation management, providing conversational AI capabilities through a chat interface to intelligently guide account teams through use case planning.
Figure 1: High-level system architecture showing the chat UI layer, Databricks Model Serving, the MLflow Model Registry, the simplified planning agent with its DSPy modules, and external services.
```mermaid
graph TB
    subgraph "Chat UI Layer"
        UI[React Chat UI]
        API[Express.js Backend]
    end

    subgraph "Databricks Model Serving"
        EP[Model Serving Endpoint]
    end

    subgraph "MLflow Model Registry"
        MODEL[Registered Model]
        VERSION[Model Version]
    end

    subgraph "Simplified Agent - SimplifiedMigrationPlanningAgent"
        AGENT[SimplifiedMigrationPlanningAgent]

        subgraph "Conversation Management"
            CM[SimplifiedConversationManager]
            INTENT[Intent Classification]
            GREET[Greeting Handler]
            INFO[Information Collector]
        end

        subgraph "Planning Modules"
            QGEN[Question Generator]
            GAP[Gap Analysis]
            PLAN[Plan Generator]
            TABULAR[Tabular Plan Generator]
        end

        subgraph "DSPy Signatures"
            ISIG[IntentClassifierSignature]
            GSIG[GreetingSignature]
            ICSIG[InformationCollectorSignature]
            QGSIG[QuestionGeneratorSignature]
            GASIG[GapAnalysisSignature]
            PGSIG[PlanGeneratorSignature]
            TPGSIG[TabularPlanGeneratorSignature]
        end
    end

    subgraph "External Services"
        DB[Databricks Claude Sonnet 4]
    end

    UI --> API
    API --> EP
    EP --> MODEL
    MODEL --> VERSION
    VERSION --> AGENT
    AGENT --> CM
    CM --> INTENT
    CM --> GREET
    CM --> INFO
    CM --> QGEN
    CM --> GAP
    CM --> PLAN
    CM --> TABULAR
    INTENT --> ISIG
    GREET --> GSIG
    INFO --> ICSIG
    QGEN --> QGSIG
    GAP --> GASIG
    PLAN --> PGSIG
    TABULAR --> TPGSIG
    ISIG --> DB
    GSIG --> DB
    ICSIG --> DB
    QGSIG --> DB
    GASIG --> DB
    PGSIG --> DB
    TPGSIG --> DB
```
DSPy signatures define the input/output structure and behavior for each AI task (a sketch of one signature follows the list):
- IntentClassifierSignature: Classifies user intent (greeting, providing_context, answering_questions, feedback_request, plan_generation)
- GreetingSignature: Handles greetings and explains capabilities for Databricks account teams
- InformationCollectorSignature: Extracts and structures customer use case information
- QuestionGeneratorSignature: Generates exactly 3 relevant questions for gathering customer information
- GapAnalysisSignature: Analyzes information completeness and identifies gaps
- PlanGeneratorSignature: Generates comprehensive tabular use case plans
- TabularPlanGeneratorSignature: Creates detailed implementation plans with phases, activities, and timelines
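As a hedged illustration of the pattern (the field names and docstring here are assumptions, not the exact definitions in the notebook), the intent classifier signature could be declared roughly like this:

```python
import dspy


class IntentClassifierSignature(dspy.Signature):
    """Classify the account team's intent for the current conversation turn."""

    user_input: str = dspy.InputField(desc="Latest message from the account team")
    conversation_context: str = dspy.InputField(desc="Summary of the conversation so far")
    intent: str = dspy.OutputField(
        desc="One of: greeting, providing_context, answering_questions, "
        "feedback_request, plan_generation"
    )
```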
The SimplifiedConversationManager handles all conversation logic using MLflow's built-in context management:
```python
from typing import Dict, Optional

import dspy


class SimplifiedConversationManager:
    def __init__(self, lm, rm):
        # Initialize DSPy components with selective Chain of Thought strategy
        self.intent_classifier = dspy.Predict(IntentClassifierSignature)
        self.greeting_handler = dspy.Predict(GreetingSignature)
        self.information_collector = dspy.Predict(InformationCollectorSignature)
        self.question_generator = dspy.ChainOfThought(QuestionGeneratorSignature)
        self.gap_analyzer = dspy.ChainOfThought(GapAnalysisSignature)
        self.plan_generator = dspy.ChainOfThought(PlanGeneratorSignature)
        self.tabular_plan_generator = dspy.ChainOfThought(TabularPlanGeneratorSignature)

    def process_user_input(self, user_input: str, context: Optional[Dict], user_id: str, conversation_id: str):
        """Process user input with conversation context."""
        # Classify intent and route to appropriate handler
        # Update conversation state
        # Return structured response
```

The simplified orchestration agent is built on MLflow's ResponsesAgent:
```python
from mlflow.pyfunc import ResponsesAgent
from mlflow.types.responses import ResponsesAgentRequest, ResponsesAgentResponse


class SimplifiedMigrationPlanningAgent(ResponsesAgent):
    def __init__(self):
        super().__init__()
        self.conversation_manager = SimplifiedConversationManager(lm, rm)

    def predict(self, request: ResponsesAgentRequest) -> ResponsesAgentResponse:
        """Process request using MLflow's built-in context management."""
        # Extract user input and context
        # Process with conversation manager
        # Return MLflow-compatible response
```

End-to-end request/response flow:

```
User Input → Chat UI → Express.js Backend → Model Serving → SimplifiedMigrationPlanningAgent
                                        ↓
Response   ← Chat UI ← Express.js Backend ← Model Serving ← MLflow ResponsesAgentResponse
```
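As a minimal sketch of that last step (assuming MLflow's ResponsesAgent helper create_text_output_item; the real predict implementation lives in the agent notebook), wrapping the conversation manager's text reply into an MLflow-compatible response could look like:

```python
from uuid import uuid4

from mlflow.types.responses import ResponsesAgentResponse


def build_response(agent: "SimplifiedMigrationPlanningAgent", text: str) -> ResponsesAgentResponse:
    # create_text_output_item formats plain text as a Responses API output item;
    # the id is an arbitrary unique identifier for this output.
    item = agent.create_text_output_item(text=text, id=str(uuid4()))
    return ResponsesAgentResponse(output=[item])
```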
Within the agent, each turn follows this intent-routing flow (sketched in code below):

```
User Input → Intent Classification → Route to Handler:
                 ↓
  - Greeting Handler (for greetings)
  - Information Collector (for context/answers)
  - Question Generator (for follow-up questions)
  - Gap Analyzer (for feedback requests)
  - Plan Generator (for plan generation)
                 ↓
Response Generation → MLflow ResponsesAgentResponse
```
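A hedged sketch of this routing, assuming the conversation manager defined above; the output field names (intent, response, questions, gap_summary, plan) are placeholders for whatever the DSPy signatures actually define:

```python
def route_turn(cm: "SimplifiedConversationManager", user_input: str, context: str) -> str:
    """Illustrative only: route a single turn through the DSPy modules."""
    intent = cm.intent_classifier(user_input=user_input, conversation_context=context).intent

    if intent == "greeting":
        return cm.greeting_handler(user_input=user_input).response
    if intent in ("providing_context", "answering_questions"):
        info = cm.information_collector(user_input=user_input, conversation_context=context)
        return cm.question_generator(collected_info=info.structured_info).questions
    if intent == "feedback_request":
        return cm.gap_analyzer(collected_info=context).gap_summary
    if intent == "plan_generation":
        return cm.tabular_plan_generator(collected_info=context).plan

    # Anything unclassified falls back to the greeting handler
    return cm.greeting_handler(user_input=user_input).response
```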
The agent guides account teams through these planning categories:
- Resource & Team: Team composition, skills, roles, training
- Customer Background & Drivers: Business drivers, timeline, constraints, cloud adoption
- Technical Scope & Architecture: Data volume, pipelines, architecture, migration approach
- Current Process Maturity: Existing processes, tools, practices, governance
- Performance & Scalability: Performance requirements, scaling needs, bottlenecks
- Security & Compliance: Security requirements, compliance standards, access control
- Natural Language Understanding: Account teams can start with "I'm working with a customer who wants to migrate from Oracle"
- Context Awareness: Maintains conversation state and customer context using MLflow's built-in management
- Intent Recognition: Understands different user intents (greeting, providing_context, answering_questions, feedback_request, plan_generation)
- Intelligent Responses: Generates helpful, contextual responses for account teams
- Category-based Questions: Asks exactly 3 relevant questions for each planning category
- Context Building: Builds understanding progressively through conversation
- Plan Generation: Creates comprehensive tabular use case plans when ready
- Gap Analysis: Tracks information completeness and identifies missing areas
- MLflow Integration: Uses MLflow's built-in conversation management
- Reduced Complexity: ~70% less code than the original implementation
- Better Performance: Selective Chain of Thought strategy for optimal performance
- Easier Debugging: Built-in debug information and MLflow's native tracing
- Model Registry: Models are registered in Unity Catalog (see the registration sketch after this list)
- Model Serving: Deployed to Databricks Model Serving
- Versioning: Supports model versioning and A/B testing
- Monitoring: Built-in model monitoring and logging
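A minimal sketch of the registration step (the model name is assembled from the catalog and schema variables shown later; the actual logging code lives in the training notebook):

```python
import mlflow

# Point the MLflow registry at Unity Catalog, then log and register the agent.
mlflow.set_registry_uri("databricks-uc")

with mlflow.start_run():
    mlflow.pyfunc.log_model(
        artifact_path="agent",
        python_model=SimplifiedMigrationPlanningAgent(),
        registered_model_name="vbdemos.usecase_agent.simplified_migration_planning_agent",
    )
```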
Example request:

```json
{
  "input": [
    {
      "role": "user",
      "content": "I'm working with a customer who wants to migrate from Oracle to Databricks"
    }
  ],
  "context": {
    "conversation_id": "conv_123",
    "user_id": "user_456"
  }
}
```

Example response:

```json
{
  "output": [
    {
      "id": "output_789",
      "type": "output_text",
      "content": [
        {
          "type": "output_text",
          "text": "Hello! I'm here to help you create a comprehensive use case plan for your customer's migration from Oracle to Databricks.\n\nTo help me create the best plan, could you tell me more about:\n\n1. How many team members are there and what are their roles?\n2. Are the teams sufficiently skilled/trained in Databricks?\n3. Are they using Professional Services or System Integrators?\n\n*Category: Resource & Team*"
        }
      ]
    }
  ]
}
```

Bundle layout:

```
bundles/
├── ai-agent/
│   ├── databricks.yml        # Bundle configuration
│   └── notebooks/
│       ├── mvp_migration_planning_agent.py
│       └── simplified_migration_planning_agent.py
└── data-pipeline/
    ├── databricks.yml
    └── notebooks/
        ├── process_documents.py
        └── create_vector_index.py
```
- Training: Notebook trains and registers model in MLflow
- Serving: Model deployed to Databricks Model Serving (see the deployment sketch after this list)
- UI Integration: Express.js backend queries serving endpoint
- Chat Interface: React frontend provides conversational UI
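A hedged sketch of the serving step using MLflow's Databricks deployment client (the endpoint and model names are assumptions consistent with the rest of this document):

```python
from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")

# Create a serving endpoint backed by version 1 of the registered UC model.
client.create_endpoint(
    name="simplified-migration-planning-agent",
    config={
        "served_entities": [
            {
                "entity_name": "vbdemos.usecase_agent.simplified_migration_planning_agent",
                "entity_version": "1",
                "workload_size": "Small",
                "scale_to_zero_enabled": True,
            }
        ]
    },
)
```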
- DSPy: Core AI framework
- Databricks Claude Sonnet 4: Language model
- MLflow: Model management and conversation tracking
- Express.js: Backend API
- React: Frontend UI
- DATABRICKS_TOKEN: Databricks access token for API authentication
- PORT: Server port (default: 5000)
- NODE_ENV: Environment (development/production)
```yaml
variables:
  catalog_name: "vbdemos"
  schema_name: "usecase_agent"
  agent_model: "databricks/databricks-claude-sonnet-4"
  temperature: "0.1"
  max_tokens: "2048"
  mlflow_experiment_name: "/Users/[email protected]/usecase-agent"
```

- Language Model: databricks/databricks-claude-sonnet-4
- Temperature: 0.1 (for consistent responses)
- Max Tokens: 2048
- DSPy Strategy: Selective Chain of Thought (simple tasks use dspy.Predict, complex tasks use dspy.ChainOfThought); see the sketch below
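A minimal sketch of how these values map onto DSPy's language model setup (the exact wiring lives in the agent notebook):

```python
import dspy

# Configure DSPy against the Databricks-hosted Claude Sonnet 4 endpoint,
# reusing the temperature and token limit from the bundle variables above.
lm = dspy.LM(
    "databricks/databricks-claude-sonnet-4",
    temperature=0.1,
    max_tokens=2048,
)
dspy.configure(lm=lm)
```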
```bash
# Deploy the bundle
databricks bundle deploy

# Run the simplified agent training
databricks bundle run simplified_migration_planning_agent_job
```

```bash
# Test via Databricks UI
# Go to Serving → simplified-migration-planning-agent → Test
# Input: {"input": [{"role": "user", "content": "I'm working with a customer who wants to migrate from Oracle to Databricks"}], "context": {"conversation_id": "test_123", "user_id": "test_user"}}
```
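The same request can also be sent programmatically; here is a minimal sketch against the serving endpoint's REST invocations API (the workspace URL is a placeholder):

```python
import os

import requests

WORKSPACE_URL = "https://<your-workspace>.cloud.databricks.com"  # placeholder
ENDPOINT = "simplified-migration-planning-agent"

payload = {
    "input": [
        {
            "role": "user",
            "content": "I'm working with a customer who wants to migrate from Oracle to Databricks",
        }
    ],
    "context": {"conversation_id": "test_123", "user_id": "test_user"},
}

resp = requests.post(
    f"{WORKSPACE_URL}/serving-endpoints/{ENDPOINT}/invocations",
    headers={"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"},
    json=payload,
    timeout=60,
)
resp.raise_for_status()
print(resp.json())
```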
```bash
# Set up environment variables
export DATABRICKS_TOKEN=your_token_here

# Start the backend
cd ui-backend
npm install
npm start

# Start the frontend (in another terminal)
cd ui-frontend
npm install
npm start

# Access at http://localhost:3000
```

```bash
# Use the simplified server for better MLflow integration
cd ui-backend
node simplified_server.js
```

- MLflow Integration: Built-in conversation management and context tracking
- Better Performance: Selective Chain of Thought strategy for optimal speed
- Conversational: Natural language interaction for account teams
- Intelligent: Context-aware responses with customer focus
- Progressive: Builds understanding step by step through structured questions
- Comprehensive: Covers all use case planning aspects
- Scalable: Built on Databricks platform with Unity Catalog
- Extensible: Easy to add new capabilities and question categories
- Removed Custom Session Management: Uses MLflow's built-in context management
- Simplified Request Handling: Uses standard ResponsesAgentRequest structure
- Cleaner Architecture: Single ConversationManager class with in-memory storage
- Better MLflow Integration: Leverages MLflow's conversation tracking
- Enhanced DSPy Signatures: Improved with detailed examples and better descriptions
- Fixed DSPy Configuration: Proper configuration for model serving compatibility
- Reduced Complexity: Easier to maintain and debug
- Multi-language Support: Support for different languages
- Custom Categories: User-defined planning categories
- Template Library: Pre-built use case templates
- Integration: Connect with CRM and project management tools
- Analytics: Track planning progress and outcomes
- Collaboration: Multi-user planning sessions
- Vector Search: Add document retrieval for enhanced knowledge base
This simplified architecture provides a robust, scalable, and maintainable solution for AI-powered use case planning with conversational capabilities specifically designed for Databricks account teams.
