# Chatbot with Summarization

This project implements a conversational AI chatbot using LangGraph for conversation management and Chainlit for the web interface. The chatbot features automatic conversation summarization to maintain context while keeping memory usage efficient.

## Features
- LangGraph Workflow: Uses LangGraph for managing conversation state and flow
- Automatic Summarization: Automatically summarizes conversations when they exceed 6 messages
- Memory Management: Efficiently manages conversation history using checkpoints
- Modern Web Interface: Beautiful Chainlit-based chat interface
- Ollama Integration: Uses local Ollama models (phi3:mini by default)
## Architecture

The system consists of two main components:

- app.py: Contains the core LangGraph workflow logic
  - Conversation state management
  - Automatic summarization
  - Message processing pipeline
- chainlit.py: Provides the web interface (see the sketch after this list)
  - User-friendly chat interface
  - Session management
  - Integration with the LangGraph workflow
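As a rough illustration of that integration, here is a hypothetical sketch of the bridge. The `@cl.on_chat_start`/`@cl.on_message` decorators and `cl.user_session` are real Chainlit APIs, but the `graph` import and the message handling are assumptions about app.py rather than its exact code:

```python
# Hypothetical sketch of the Chainlit <-> LangGraph bridge.
# Assumes app.py exposes a compiled graph (see the workflow sketch below).
import uuid

import chainlit as cl

from app import graph  # assumed: the compiled LangGraph workflow


@cl.on_chat_start
async def on_chat_start():
    # Give each browser session its own checkpoint thread.
    cl.user_session.set("thread_id", str(uuid.uuid4()))


@cl.on_message
async def on_message(message: cl.Message):
    # Route the user's message through the workflow for this session's thread.
    config = {"configurable": {"thread_id": cl.user_session.get("thread_id")}}
    result = graph.invoke({"messages": [("user", message.content)]}, config)
    # Reply with the latest assistant message from the graph state.
    await cl.Message(content=result["messages"][-1].content).send()
```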
## Prerequisites

- Python 3.8+
- Ollama installed and running locally
- phi3:mini model downloaded (`ollama pull phi3:mini`)
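If you want to sanity-check these prerequisites before installing, both commands below are standard Ollama CLI calls:

```bash
ollama --version   # confirms the CLI is installed
ollama list        # phi3:mini should appear among the installed models
```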
## Installation

- Clone the repository and navigate to the project directory:

  ```bash
  cd chatbot_with_summarization
  ```

- Create and activate a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Ensure Ollama is running and the model is available:

  ```bash
  ollama serve
  ollama pull phi3:mini
  ```

## Usage

Start the Chainlit chatbot:

```bash
chainlit run chainlit.py
```

The chatbot will be available at http://localhost:8000.
You can also test the core LangGraph workflow directly:

```bash
python app.py
```

## How It Works

The pieces fit together as follows (a code sketch after this list illustrates the cycle):

- Conversation Start: When a user starts chatting, a unique thread ID is created
- Message Processing: Each message is processed through the LangGraph workflow
- Automatic Summarization: When the conversation exceeds 6 messages, the system automatically summarizes the conversation
- Memory Cleanup: Old messages are removed, keeping only the most recent ones
- Context Preservation: The summary maintains conversation context for future interactions
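To make that cycle concrete, here is a minimal sketch of such a workflow, assuming LangGraph's prebuilt MessagesState, the MemorySaver checkpointer, and the langchain-ollama ChatOllama binding. The summarization prompt and the keep-last-two pruning policy are illustrative assumptions, not necessarily what app.py does:

```python
# Illustrative sketch, not the exact app.py: state, nodes, and wiring
# for summarize-and-prune conversation memory.
from typing import Literal

from langchain_core.messages import HumanMessage, RemoveMessage, SystemMessage
from langchain_ollama import ChatOllama
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import END, START, MessagesState, StateGraph


class State(MessagesState):
    summary: str  # running summary of the pruned history


model = ChatOllama(model="phi3:mini", temperature=0.8, num_predict=256)


def conversation(state: State) -> dict:
    # Prepend the running summary (if any) so pruned context is not lost.
    messages = state["messages"]
    if state.get("summary"):
        messages = [SystemMessage(content=f"Conversation summary: {state['summary']}")] + messages
    return {"messages": [model.invoke(messages)]}


def summarize_conversation(state: State) -> dict:
    # Ask the model for a summary, then delete all but the two most
    # recent messages (assumed pruning policy).
    prompt = state["messages"] + [HumanMessage(content="Summarize the conversation above.")]
    summary = model.invoke(prompt).content
    deletions = [RemoveMessage(id=m.id) for m in state["messages"][:-2]]
    return {"summary": summary, "messages": deletions}


def should_continue(state: State) -> Literal["summarize_conversation", END]:
    # Summarize once the history exceeds 6 messages, otherwise stop.
    return "summarize_conversation" if len(state["messages"]) > 6 else END


workflow = StateGraph(State)
workflow.add_node("conversation", conversation)
workflow.add_node("summarize_conversation", summarize_conversation)
workflow.add_edge(START, "conversation")
workflow.add_conditional_edges("conversation", should_continue)
workflow.add_edge("summarize_conversation", END)

# MemorySaver checkpoints state per thread_id, which is what lets each
# chat session resume with its summary intact.
graph = workflow.compile(checkpointer=MemorySaver())
```

Invoking the compiled graph with a `thread_id` (the same pattern the Chainlit bridge above uses) shows the checkpointing in action:

```python
config = {"configurable": {"thread_id": "demo-thread"}}
out = graph.invoke({"messages": [("user", "Hi, I'm planning a trip to Japan.")]}, config)
print(out["messages"][-1].content)
```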
## Configuration

### Model Settings

You can modify the model settings in app.py:

```python
def create_model():
    return ChatOllama(
        model="phi3:mini",   # Change model here
        temperature=0.8,     # Adjust creativity
        num_predict=256,     # Adjust response length
    )
```

### Summarization Threshold

Adjust when summarization occurs by modifying the should_continue function:

```python
def should_continue(state: State) -> Literal["summarize_conversation", END]:
    messages = state["messages"]
    if len(messages) > 6:  # Change this number
        return "summarize_conversation"
    return END
```

### Interface

Customize the interface appearance in .chainlit/config.toml:
- Theme (light/dark)
- Assistant name and description
- Chat features and settings
## Project Structure

```
chatbot_with_summarization/
├── app.py              # Core LangGraph workflow
├── chainlit.py         # Chainlit web interface
├── requirements.txt    # Python dependencies
├── .chainlit/          # Chainlit configuration
│   └── config.toml
└── README.md           # This file
```
## Troubleshooting

- Ollama Connection Error: Ensure Ollama is running (`ollama serve`)
- Model Not Found: Download the required model (`ollama pull phi3:mini`)
- Port Already in Use: Chainlit uses port 8000 by default. Change it with `chainlit run chainlit.py --port 8001`
## Performance Tips

- Use smaller models for faster responses
- Adjust the summarization threshold based on your needs
- Monitor memory usage with very long conversations
## Contributing

Feel free to submit issues and enhancement requests!

## License

This project is open source and available under the MIT License.