- System Overview
- Architecture
- Backend Components
- Frontend Components
- Knowledge Base System
- Setup and Installation
- API Reference
The AI Data Analyst is an intelligent system that combines natural language processing, machine learning, and data visualization to provide intuitive data analysis capabilities. It allows users to query complex data using natural language and automatically generates optimized SQL queries for ClickHouse database.
- Natural language to SQL conversion
- Intelligent query optimization
- Interactive data visualization
- Continuous learning system
- Real-time performance monitoring
- Backend: Python FastAPI
- Frontend: Vue.js 3.x
- Database: ClickHouse
- AI Service: Azure OpenAI
- Visualization: Chart.js
┌─────────────────┐ ┌──────────────────┐ ┌────────────────┐
│ Frontend │────▶│ Backend │────▶│ ClickHouse │
│ (Vue.js) │◀────│ (FastAPI) │◀────│ Database │
└─────────────────┘ └──────────────────┘ └────────────────┘
│ ▲
▼ │
┌──────────────────┐
│ Knowledge Base │
│ System │
└──────────────────┘
-
Query Generation
def generate_query(question: str) -> str: # Converts natural language to SQL # Uses AI to understand context # Returns optimized SQL query
-
Query Optimization
def optimize_query(query: str) -> str: # Applies performance optimizations # Adds necessary limits and filters # Ensures ClickHouse compatibility
-
Query Execution
def execute_query(query: str) -> Dict: # Executes the query safely # Handles data type conversions # Returns formatted results
-
POST /analyze
- Processes natural language queries
- Returns query results and visualizations
- Includes performance metrics
-
GET /tables
- Returns database schema information
- Includes table statistics
- Shows available columns and types
-
POST /analyze_query
- Analyzes SQL queries without execution
- Provides query optimization suggestions
- Shows estimated performance metrics
- Handles user interactions
- Manages application state
- Coordinates component communication
// Supports multiple chart types
- Bar charts
- Line charts
- Area charts
- Scatter plots
- Dynamic visualization selection
- Responsive design
- Interactive data exploration
-
Query Input
- Natural language input field
- Query history
- Suggestion system
-
Results Display
- Interactive charts
- Data tables
- Export options
-
Performance Metrics
- Query execution time
- Data volume metrics
- Optimization suggestions
CREATE TABLE knowledge_base (
timestamp DateTime,
knowledge_id String,
raw_knowledge String,
analysis String,
patterns String
) ENGINE = MergeTree()
ORDER BY (timestamp, knowledge_id)
-
Pattern Recognition
- Query pattern analysis
- Data relationship identification
- Business rule extraction
-
Knowledge Application
- Pattern matching
- Query optimization
- Result enhancement
-
Continuous Learning
- User interaction analysis
- Performance optimization
- Pattern refinement
- Python 3.8+
- Node.js 14+
- ClickHouse Database
- Azure OpenAI API access
# Install Python dependencies
pip install -r requirements.txt
# Configure environment variables
cp .env.example .env
# Edit .env with your configurations
# Start the backend server
uvicorn app.main:app --reload
# Install dependencies
cd vue-frontend
npm install
# Start development server
npm run dev
# Build for production
npm run build
POST /analyze
Content-Type: application/json
{
"question": "Show me total sales by product for last month"
}
GET /tables
POST /analyze_query
Content-Type: application/json
{
"query": "SELECT * FROM sales LIMIT 100"
}
- Automatic limit addition
- Index utilization
- Join optimization
- Aggregation optimization
- Schema caching
- Query result caching
- Knowledge base caching
- Role-based access control
- Query validation
- Input sanitization
- Authentication
- Rate limiting
- CORS configuration
-
Connection Issues
- Check database connectivity
- Verify API credentials
- Check network settings
-
Performance Issues
- Monitor query complexity
- Check data volume
- Review cache usage
-
Query Generation Issues
- Verify knowledge base
- Check question formatting
- Review error logs
-
Code Style
- Follow PEP 8 for Python
- Use Vue.js style guide
- Document all functions
-
Testing
- Write unit tests
- Perform integration testing
- Test with sample data
-
Documentation
- Update API documentation
- Document new features
- Maintain changelog
This project is licensed under the MIT License - see the LICENSE file for details.
Last updated: February 25, 2025