Skip to content

ahmetgkdemr/ChatBotAI

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

TeknoMarket AI Customer Support Chatbot

A full-stack, locally-hosted AI customer support chatbot built with RAG (Retrieval-Augmented Generation). It answers customer questions by searching a vector knowledge base and generating natural language responses — all running on your machine with zero API costs.

Python Angular FastAPI PostgreSQL Ollama Docker


Features

  • RAG Pipeline — Retrieves relevant knowledge chunks via vector similarity before generating answers, ensuring accurate and grounded responses
  • 100% Local & Free — Uses Ollama with Gemma 3 4B for LLM inference and sentence-transformers for embeddings — no API keys or cloud services required
  • Vector Search — PostgreSQL + pgvector for fast cosine similarity search across 384-dimensional embeddings
  • Real-time Streaming — Server-Sent Events (SSE) deliver token-by-token responses to the UI as the LLM generates them
  • 50-Item Knowledge Base — Pre-built Turkish FAQ covering 14 categories (shipping, returns, payments, account management, etc.)
  • Modern Chat UI — Full-page Angular 17 chat interface with typing indicators, source citations, quick-start questions, and responsive design
  • Multi-tenant Readytenant_id field supports multiple knowledge bases in a single database

Architecture

┌─────────────────┐     HTTP/SSE      ┌─────────────────────────────────────┐
│                 │  ───────────────►  │            FastAPI Backend          │
│   Angular 17    │                    │                                     │
│   Chat UI       │  ◄───────────────  │  ┌───────────┐   ┌──────────────┐  │
│   (port 4200)   │    SSE stream      │  │  RAG       │   │  Ollama      │  │
│                 │                    │  │  Service   │──►│  (Gemma 3)   │  │
└─────────────────┘                    │  └─────┬─────┘   └──────────────┘  │
                                       │        │                            │
                                       │  ┌─────▼─────────────────────────┐  │
                                       │  │  sentence-transformers        │  │
                                       │  │  (all-MiniLM-L6-v2, 384d)    │  │
                                       │  └─────┬────────────────────────┘  │
                                       │        │                            │
                                       │  ┌─────▼─────┐                     │
                                       │  │ PostgreSQL │                     │
                                       │  │ + pgvector │                     │
                                       │  └───────────┘                     │
                                       └─────────────────────────────────────┘

How a query flows:

  1. User sends a message from the Angular chat UI
  2. Backend embeds the query using sentence-transformers (384-dim vector)
  3. pgvector performs cosine similarity search against the knowledge base
  4. Top-5 most relevant chunks are retrieved as context
  5. Context + question are sent to Ollama (Gemma 3 4B) with a system prompt
  6. Response streams back token-by-token via SSE to the frontend

Tech Stack

Layer Technology Purpose
Frontend Angular 17 (standalone components) Chat UI with SSE streaming
Backend FastAPI + Python 3.11 Async REST API + SSE endpoints
LLM Ollama (Gemma 3 4B) Local language model inference
Embeddings sentence-transformers (all-MiniLM-L6-v2) Local 384-dim text embeddings
Database PostgreSQL 16 + pgvector Vector similarity search
Infra Docker Compose Database container orchestration

Project Structure

ChatBotAI/
├── docker-compose.yml           # PostgreSQL + pgvector container
├── knowledge-base.json          # 50-item Turkish FAQ knowledge base
│
├── backend/
│   ├── Dockerfile               # Python API container
│   ├── requirements.txt         # Python dependencies
│   ├── seed_knowledge.py        # Seeds knowledge base into pgvector
│   ├── .env.example             # Environment variables template
│   └── app/
│       ├── main.py              # FastAPI app, CORS, lifespan
│       ├── config.py            # Pydantic settings from .env
│       ├── database.py          # Async SQLAlchemy + pgvector setup
│       ├── models/
│       │   └── schemas.py       # Request/response Pydantic models
│       ├── api/
│       │   ├── chat.py          # POST /api/chat, POST /api/chat/stream
│       │   └── knowledge.py     # CRUD endpoints for knowledge base
│       └── services/
│           ├── embedding_service.py   # Singleton sentence-transformer
│           ├── rag_service.py         # Vector search + context builder
│           └── claude_service.py      # Ollama LLM client (OpenAI-compatible)
│
└── frontend/
    ├── angular.json             # Angular CLI config with proxy
    ├── proxy.conf.json          # Dev proxy → backend:8000
    └── src/app/
        ├── app.component.ts     # Root component
        ├── models/
        │   └── chat.models.ts   # TypeScript interfaces
        ├── services/
        │   └── chat.service.ts  # HTTP + SSE streaming client
        └── components/
            └── chat-widget/     # Full-page chat component
                ├── chat-widget.component.ts
                ├── chat-widget.component.html
                └── chat-widget.component.scss

Prerequisites

  • Docker Desktop — for PostgreSQL + pgvector
  • Python 3.11+ — for the backend
  • Node.js 18+ and Angular CLI 17+ — for the frontend
  • Ollama — for local LLM inference

Getting Started

1. Clone the repository

git clone https://github.com/ahmetgkdemr/ChatBotAI.git
cd ChatBotAI

2. Start PostgreSQL with pgvector

docker compose up db -d

Wait for the health check to pass (~5 seconds).

3. Install and run Ollama

Download Ollama and pull the model:

ollama pull gemma3:4b

Make sure Ollama is running (it starts automatically after installation).

4. Set up the backend

cd backend
pip install -r requirements.txt
cp .env.example .env

Seed the knowledge base into pgvector:

python -X utf8 seed_knowledge.py

Start the API server:

uvicorn app.main:app --reload --port 8000

5. Set up the frontend

cd frontend
npm install
ng serve

6. Open the app

Navigate to http://localhost:4200 — the chat interface will be ready.


API Endpoints

Method Endpoint Description
GET /api/health Health check
POST /api/chat Synchronous chat response
POST /api/chat/stream SSE streaming chat response
GET /api/knowledge List knowledge base entries
GET /api/knowledge/stats Category distribution stats
POST /api/knowledge Add new knowledge entry
DELETE /api/knowledge/{id} Delete knowledge entry

Example: Chat Request

curl -X POST http://localhost:8000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "How can I track my order?", "tenant_id": "demo"}'

Example: Response

{
  "response": "You can track your order from 'My Account > My Orders' page...",
  "sources": [
    {
      "question": "How do I track my order?",
      "category": "Order Tracking",
      "score": 0.8923
    }
  ]
}

Knowledge Base

The knowledge base (knowledge-base.json) contains 50 Turkish FAQ entries across 14 categories:

Category Count Topics
Account Management 5 Registration, password reset, 2FA, profile
Shipping 5 Tracking, delivery times, free shipping
Returns & Refunds 5 Return policy, refund process, conditions
Payment Methods 5 Credit card, bank transfer, installments
Order Management 4 Cancellation, modification, bulk orders
Product Info 3 Warranty, specs, authenticity
Campaigns & Discounts 3 Promo codes, loyalty program
Technical Support 3 App issues, notifications
Privacy & Security 3 Data protection, KVKK compliance
Corporate Sales 3 B2B, invoicing
Gift Cards 2 Purchase, redemption
Marketplace 3 Third-party sellers
Accessibility 2 Accessibility features
Sustainability 2 Eco-friendly packaging

Configuration

All configuration is managed through environment variables in backend/.env:

Variable Default Description
OLLAMA_BASE_URL http://localhost:11434/v1 Ollama API endpoint (OpenAI-compatible)
OLLAMA_MODEL gemma3:4b LLM model name in Ollama
DATABASE_URL postgresql+asyncpg://... PostgreSQL connection string
EMBEDDING_MODEL all-MiniLM-L6-v2 sentence-transformers model name

Using a different LLM

You can swap the LLM by changing OLLAMA_MODEL in .env to any Ollama-supported model:

# Larger, higher quality
OLLAMA_MODEL=llama3.1:8b

# Smaller, faster
OLLAMA_MODEL=gemma3:1b

# Multilingual
OLLAMA_MODEL=qwen2.5:7b

Screenshots

Chat Interface

The full-page chat interface features:

  • Welcome screen with quick-start question buttons
  • User messages (right, blue) and AI responses (left, gray)
  • Real-time streaming with typing indicator
  • Source category badges on each response
  • Responsive layout (mobile full-screen, desktop centered card)

License

This project is for educational and demonstration purposes.

About

RAG-based AI customer support chatbot — FastAPI, Angular 17, PostgreSQL + pgvector, Ollama (local LLM), sentence-transformers. 100% local & free.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors