TeknoMarket AI Customer Support Chatbot

A full-stack, locally-hosted AI customer support chatbot built with RAG (Retrieval-Augmented Generation). It answers customer questions by searching a vector knowledge base and generating natural language responses — all running on your machine with zero API costs.

Features

RAG Pipeline — Retrieves relevant knowledge chunks via vector similarity before generating answers, ensuring accurate and grounded responses
100% Local & Free — Uses Ollama with Gemma 3 4B for LLM inference and sentence-transformers for embeddings — no API keys or cloud services required
Vector Search — PostgreSQL + pgvector for fast cosine similarity search across 384-dimensional embeddings
Real-time Streaming — Server-Sent Events (SSE) deliver token-by-token responses to the UI as the LLM generates them
50-Item Knowledge Base — Pre-built Turkish FAQ covering 14 categories (shipping, returns, payments, account management, etc.)
Modern Chat UI — Full-page Angular 17 chat interface with typing indicators, source citations, quick-start questions, and responsive design
Multi-tenant Ready — tenant_id field supports multiple knowledge bases in a single database

Architecture

┌─────────────────┐     HTTP/SSE      ┌─────────────────────────────────────┐
│                 │  ───────────────►  │            FastAPI Backend          │
│   Angular 17    │                    │                                     │
│   Chat UI       │  ◄───────────────  │  ┌───────────┐   ┌──────────────┐  │
│   (port 4200)   │    SSE stream      │  │  RAG       │   │  Ollama      │  │
│                 │                    │  │  Service   │──►│  (Gemma 3)   │  │
└─────────────────┘                    │  └─────┬─────┘   └──────────────┘  │
                                       │        │                            │
                                       │  ┌─────▼─────────────────────────┐  │
                                       │  │  sentence-transformers        │  │
                                       │  │  (all-MiniLM-L6-v2, 384d)    │  │
                                       │  └─────┬────────────────────────┘  │
                                       │        │                            │
                                       │  ┌─────▼─────┐                     │
                                       │  │ PostgreSQL │                     │
                                       │  │ + pgvector │                     │
                                       │  └───────────┘                     │
                                       └─────────────────────────────────────┘

How a query flows:

User sends a message from the Angular chat UI
Backend embeds the query using sentence-transformers (384-dim vector)
pgvector performs cosine similarity search against the knowledge base
Top-5 most relevant chunks are retrieved as context
Context + question are sent to Ollama (Gemma 3 4B) with a system prompt
Response streams back token-by-token via SSE to the frontend

Tech Stack

Layer	Technology	Purpose
Frontend	Angular 17 (standalone components)	Chat UI with SSE streaming
Backend	FastAPI + Python 3.11	Async REST API + SSE endpoints
LLM	Ollama (Gemma 3 4B)	Local language model inference
Embeddings	sentence-transformers (all-MiniLM-L6-v2)	Local 384-dim text embeddings
Database	PostgreSQL 16 + pgvector	Vector similarity search
Infra	Docker Compose	Database container orchestration

Project Structure

ChatBotAI/
├── docker-compose.yml           # PostgreSQL + pgvector container
├── knowledge-base.json          # 50-item Turkish FAQ knowledge base
│
├── backend/
│   ├── Dockerfile               # Python API container
│   ├── requirements.txt         # Python dependencies
│   ├── seed_knowledge.py        # Seeds knowledge base into pgvector
│   ├── .env.example             # Environment variables template
│   └── app/
│       ├── main.py              # FastAPI app, CORS, lifespan
│       ├── config.py            # Pydantic settings from .env
│       ├── database.py          # Async SQLAlchemy + pgvector setup
│       ├── models/
│       │   └── schemas.py       # Request/response Pydantic models
│       ├── api/
│       │   ├── chat.py          # POST /api/chat, POST /api/chat/stream
│       │   └── knowledge.py     # CRUD endpoints for knowledge base
│       └── services/
│           ├── embedding_service.py   # Singleton sentence-transformer
│           ├── rag_service.py         # Vector search + context builder
│           └── claude_service.py      # Ollama LLM client (OpenAI-compatible)
│
└── frontend/
    ├── angular.json             # Angular CLI config with proxy
    ├── proxy.conf.json          # Dev proxy → backend:8000
    └── src/app/
        ├── app.component.ts     # Root component
        ├── models/
        │   └── chat.models.ts   # TypeScript interfaces
        ├── services/
        │   └── chat.service.ts  # HTTP + SSE streaming client
        └── components/
            └── chat-widget/     # Full-page chat component
                ├── chat-widget.component.ts
                ├── chat-widget.component.html
                └── chat-widget.component.scss

Prerequisites

Docker Desktop — for PostgreSQL + pgvector
Python 3.11+ — for the backend
Node.js 18+ and Angular CLI 17+ — for the frontend
Ollama — for local LLM inference

Getting Started

1. Clone the repository

git clone https://github.com/ahmetgkdemr/ChatBotAI.git
cd ChatBotAI

2. Start PostgreSQL with pgvector

docker compose up db -d

Wait for the health check to pass (~5 seconds).

3. Install and run Ollama

Download Ollama and pull the model:

ollama pull gemma3:4b

Make sure Ollama is running (it starts automatically after installation).

4. Set up the backend

cd backend
pip install -r requirements.txt
cp .env.example .env

Seed the knowledge base into pgvector:

python -X utf8 seed_knowledge.py

Start the API server:

uvicorn app.main:app --reload --port 8000

5. Set up the frontend

cd frontend
npm install
ng serve

6. Open the app

Navigate to http://localhost:4200 — the chat interface will be ready.

API Endpoints

Method	Endpoint	Description
`GET`	`/api/health`	Health check
`POST`	`/api/chat`	Synchronous chat response
`POST`	`/api/chat/stream`	SSE streaming chat response
`GET`	`/api/knowledge`	List knowledge base entries
`GET`	`/api/knowledge/stats`	Category distribution stats
`POST`	`/api/knowledge`	Add new knowledge entry
`DELETE`	`/api/knowledge/{id}`	Delete knowledge entry

Example: Chat Request

curl -X POST http://localhost:8000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "How can I track my order?", "tenant_id": "demo"}'

Example: Response

{
  "response": "You can track your order from 'My Account > My Orders' page...",
  "sources": [
    {
      "question": "How do I track my order?",
      "category": "Order Tracking",
      "score": 0.8923
    }
  ]
}

Knowledge Base

The knowledge base (knowledge-base.json) contains 50 Turkish FAQ entries across 14 categories:

Category	Count	Topics
Account Management	5	Registration, password reset, 2FA, profile
Shipping	5	Tracking, delivery times, free shipping
Returns & Refunds	5	Return policy, refund process, conditions
Payment Methods	5	Credit card, bank transfer, installments
Order Management	4	Cancellation, modification, bulk orders
Product Info	3	Warranty, specs, authenticity
Campaigns & Discounts	3	Promo codes, loyalty program
Technical Support	3	App issues, notifications
Privacy & Security	3	Data protection, KVKK compliance
Corporate Sales	3	B2B, invoicing
Gift Cards	2	Purchase, redemption
Marketplace	3	Third-party sellers
Accessibility	2	Accessibility features
Sustainability	2	Eco-friendly packaging

Configuration

All configuration is managed through environment variables in backend/.env:

Variable	Default	Description
`OLLAMA_BASE_URL`	`http://localhost:11434/v1`	Ollama API endpoint (OpenAI-compatible)
`OLLAMA_MODEL`	`gemma3:4b`	LLM model name in Ollama
`DATABASE_URL`	`postgresql+asyncpg://...`	PostgreSQL connection string
`EMBEDDING_MODEL`	`all-MiniLM-L6-v2`	sentence-transformers model name

Using a different LLM

You can swap the LLM by changing OLLAMA_MODEL in .env to any Ollama-supported model:

# Larger, higher quality
OLLAMA_MODEL=llama3.1:8b

# Smaller, faster
OLLAMA_MODEL=gemma3:1b

# Multilingual
OLLAMA_MODEL=qwen2.5:7b

Screenshots

Chat Interface

The full-page chat interface features:

Welcome screen with quick-start question buttons
User messages (right, blue) and AI responses (left, gray)
Real-time streaming with typing indicator
Source category badges on each response
Responsive layout (mobile full-screen, desktop centered card)

License

This project is for educational and demonstration purposes.

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
backend		backend
frontend		frontend
.gitignore		.gitignore
README.md		README.md
docker-compose.yml		docker-compose.yml
knowledge-base.json		knowledge-base.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

TeknoMarket AI Customer Support Chatbot

Features

Architecture

Tech Stack

Project Structure

Prerequisites

Getting Started

1. Clone the repository

2. Start PostgreSQL with pgvector

3. Install and run Ollama

4. Set up the backend

5. Set up the frontend

6. Open the app

API Endpoints

Example: Chat Request

Example: Response

Knowledge Base

Configuration

Using a different LLM

Screenshots

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

TeknoMarket AI Customer Support Chatbot

Features

Architecture

Tech Stack

Project Structure

Prerequisites

Getting Started

1. Clone the repository

2. Start PostgreSQL with pgvector

3. Install and run Ollama

4. Set up the backend

5. Set up the frontend

6. Open the app

API Endpoints

Example: Chat Request

Example: Response

Knowledge Base

Configuration

Using a different LLM

Screenshots

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages