This project is a Retrieval-Augmented Generation (RAG) chatbot built with Streamlit + LangChain + OpenAI.
It allows you to upload PDF files, query them in natural language, and get AI-powered answers with sources.
- 📄 Upload multiple PDFs
- ✂️ Split text into smart chunks
- 🧠 Generate embeddings with OpenAI
- 💾 Store + retrieve chunks using ChromaDB
- 🤖 Ask questions & get contextual answers with sources shown
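The "smart chunks" step above splits each document into overlapping, fixed-size pieces so context isn't cut mid-thought. The app relies on LangChain's text splitter for this; the stdlib sketch below (function name and sizes are illustrative, not the app's actual code) shows the core idea:

```python
# Illustrative sketch of the chunking step: fixed-size chunks with a small
# overlap so consecutive chunks share context at their boundary.
def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into ~chunk_size-character chunks; consecutive chunks
    share `overlap` characters."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

doc = "x" * 1200
print(len(split_into_chunks(doc)))  # 3 chunks: 0-500, 450-950, 900-1200
```

The overlap is what makes the chunks "smart": a sentence straddling a boundary still appears whole in at least one chunk.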
We’ve made several significant upgrades and fixes to evolve this chatbot from a simple prototype into a robust, fully conversational application.
- Before: Each query was treated independently. Follow-ups like “why?” made no sense.
- After: Implemented a history-aware RAG chain that reformulates follow-ups into full contextual questions (e.g., “why?” after asking about the sky’s color becomes “why is the sky blue?”).
➤ The chatbot now maintains multi-turn memory, enabling natural and contextually aware conversations.
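Conceptually, the history-aware step prompts the model to rewrite a terse follow-up into a standalone question before retrieval. The app uses LangChain's built-in chain for this; the hand-rolled prompt builder below (our own illustrative names, not the library's API) shows what that reformulation prompt looks like:

```python
# Sketch of the history-aware reformulation step: combine prior turns with
# the terse follow-up into a prompt asking for a standalone question.
def build_reformulation_prompt(history: list[tuple[str, str]], follow_up: str) -> str:
    lines = ["Given the chat history, rewrite the follow-up as a standalone question.", ""]
    for user_msg, ai_msg in history:
        lines.append(f"User: {user_msg}")
        lines.append(f"AI: {ai_msg}")
    lines.append(f"Follow-up: {follow_up}")
    return "\n".join(lines)

prompt = build_reformulation_prompt(
    [("What color is the sky?", "The sky is blue.")],
    "why?",
)
print(prompt)
```

The reformulated question ("why is the sky blue?") is then what gets embedded and sent to the retriever, so the fetched chunks match the full intent, not just the word "why".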
- Before: Used outdated callbacks (`StreamHandler`) that caused `NoSessionContext` errors.
- After: Replaced with LangChain’s modern `.stream()` method.
➤ This allows real-time response streaming directly in Streamlit without threading issues.
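The streaming loop is simple in shape: `chain.stream(...)` returns a generator of chunks, and Streamlit renders them incrementally. The sketch below fakes the chain with a plain generator (the `fake_stream` name and word-level "tokens" are ours, for illustration only):

```python
# Minimal sketch of token streaming: the chain yields chunks one at a time,
# which Streamlit can render incrementally, e.g.
#   st.write_stream(chain.stream({"input": question}))
def fake_stream(answer: str):
    """Stand-in for chain.stream(): yield the answer word by word."""
    for token in answer.split(" "):
        yield token + " "

streamed = "".join(fake_stream("The sky is blue."))
print(streamed)  # "The sky is blue. "
```

Because the generator runs in the script's own thread, there is no cross-thread session access, which is what made the old callback approach fail.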
- 🔄 Circular Import Fixed: Renamed local `langchain.py` → `app.py` to avoid module conflicts.
- 🔕 Disabled ChromaDB telemetry logs for cleaner terminal output.
In summary, the chatbot is now stable, conversational, and fully interactive.
- 📂 Multi-PDF Support — Upload and query multiple documents
- 🧩 Chunking & Embedding — Splits content for better context retrieval
- 🔍 RAG Pipeline — Retrieval + Context-aware AI answers
- 🧠 Conversational Memory — Handles follow-up questions seamlessly
- 💬 Real-time Streaming — Smooth token-by-token response in Streamlit
- 📊 Source Transparency — Displays top 3 document sources
- ⚡ Streamlit UI — Simple and interactive interface
| Layer | Technologies | Purpose |
|---|---|---|
| Frontend | Streamlit | Interactive UI |
| Backend | Python, LangChain | RAG pipeline & orchestration |
| Vector DB | ChromaDB | Store & retrieve embeddings |
| Document Loader | PyMuPDF | Parse PDF files |
| LLM + Embeddings | OpenAI (GPT + embeddings) | Contextual QA |
- 📥 Upload PDFs — User uploads documents via Streamlit UI
- ✂️ Text Splitting — Documents are chunked into smaller passages
- 🔑 Embedding — Each chunk is embedded using OpenAI embeddings
- 💾 Vector Store — Chunks + embeddings stored in ChromaDB
- ❓ Query — User asks a question
- 🔍 Retriever — Relevant chunks are retrieved
- 🤖 LLM Response — GPT answers using retrieved context
- 📑 Sources — Top 3 supporting chunks shown
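Steps 3–6 and 8 above boil down to: embed each chunk, embed the query, and return the chunks whose embeddings are most similar. A real run uses OpenAI embeddings and ChromaDB; the stdlib toy below substitutes word-count vectors and cosine similarity (all names are illustrative) to make the retrieval mechanics concrete:

```python
# Toy retrieval pipeline: "embed" chunks as word-count vectors, score them
# against the query by cosine similarity, and return the top 3 sources.
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Stand-in for an embedding model: lowercase bag-of-words counts.
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return Counter(cleaned.split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k(chunks: list[str], query: str, k: int = 3) -> list[str]:
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(embed(c), q), reverse=True)[:k]

chunks = [
    "Rayleigh scattering makes the sky appear blue.",
    "Invoice totals are due within 30 days.",
    "The report covers quarterly revenue.",
    "Blue light scatters more than red light in the sky.",
]
print(top_k(chunks, "why is the sky blue?"))
```

The top-k list is exactly what the app surfaces as its "top 3 sources" alongside the LLM's answer; the retrieved chunks are also stuffed into the prompt as context for the generation step.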
- Python 3.9+
- OpenAI API Key
```bash
git clone https://github.com/your-username/rag-file-chatbot.git
cd rag-file-chatbot
pip install -r requirements.txt
```

Create a `.env` file (see `.env.example`):

```
OPENAI_API_KEY=your_openai_key
```

Run the app:

```bash
streamlit run app.py
```

The app will run locally at 👉 http://localhost:8501
```
rag-file-chatbot/
├── app.py              # Main Streamlit app
├── requirements.txt    # Python dependencies
├── .env.example        # Example API keys
├── .gitignore
└── README.md
```
Built with ❤️ by Kartik Garg