Added basic text to sql RAG Framework code #2
Step 1:
I developed a SQLite database using Python and created an Employee table containing essential fields like name, age, city, gender, total experience, and blood group.
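The table described above can be sketched with Python's built-in `sqlite3` module; the column names and sample row below are illustrative, and the exact schema in the repository may differ (the project stores the table in `employee.db` on disk, while this sketch uses an in-memory database):

```python
import sqlite3

# In-memory database for illustration; the project uses employee.db on disk.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Employee table with the fields described in Step 1.
cur.execute("""
    CREATE TABLE IF NOT EXISTS Employee (
        name TEXT,
        age INTEGER,
        city TEXT,
        gender TEXT,
        total_experience REAL,
        blood_group TEXT
    )
""")

# Insert one sample row (illustrative data).
cur.execute(
    "INSERT INTO Employee VALUES (?, ?, ?, ?, ?, ?)",
    ("Asha", 29, "Mumbai", "F", 5.5, "O+"),
)
conn.commit()

rows = cur.execute("SELECT name, city FROM Employee").fetchall()
```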
Step 2:
I designed a CSV file (employee_questions.csv) documenting natural-language questions, their corresponding SQL queries, and short explanations, a technique known as few-shot prompting.
This file serves as few-shot examples to help the model understand and generate correct queries.
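A minimal sketch of what such a few-shot CSV could look like, written and read back with the standard `csv` module; the questions, queries, and column names here are illustrative, not the repository's actual contents:

```python
import csv
import io

# Illustrative few-shot examples: question, SQL query, short explanation.
rows = [
    {"question": "How many employees live in Delhi?",
     "sql": "SELECT COUNT(*) FROM Employee WHERE city = 'Delhi';",
     "explanation": "Counts rows filtered by city."},
    {"question": "What is the average age of employees?",
     "sql": "SELECT AVG(age) FROM Employee;",
     "explanation": "Aggregates the age column."},
]

# Write the examples in CSV form (a StringIO stands in for the file on disk).
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["question", "sql", "explanation"])
writer.writeheader()
writer.writerows(rows)

# Read the examples back, as the framework would when building prompts.
buf.seek(0)
examples = list(csv.DictReader(buf))
```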
Step 3:
I converted the questions and queries into embeddings using OpenAI's embedding model and stored them in FAISS, a high-speed vector similarity search library, for efficient retrieval.
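A sketch of this step, assuming the `openai` and `faiss-cpu` packages and an `OPENAI_API_KEY` in the environment; the embedding model name and the flat L2 index are assumptions, not necessarily the repository's choices:

```python
import faiss
import numpy as np
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed(texts):
    # text-embedding-3-small is an assumed model choice.
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data], dtype="float32")

# Embed the few-shot questions and add them to a FAISS index.
questions = [
    "How many employees live in Delhi?",
    "What is the average age of employees?",
]
vectors = embed(questions)

index = faiss.IndexFlatL2(vectors.shape[1])  # exact L2 search over embeddings
index.add(vectors)

# Retrieve the nearest stored example for a new user question.
distances, ids = index.search(embed(["count of Delhi employees"]), 1)
```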
Step 4:
Using LangChain, I implemented:
A FAISS retriever to search similar examples based on user input.
A Conversational Retrieval Chain that connects the retriever to a language model (LLM).
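The two pieces above can be wired together roughly as follows, using LangChain's FAISS vector store and `ConversationalRetrievalChain`; the example text, `k` value, and model name are assumptions:

```python
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain.chains import ConversationalRetrievalChain

# Build a FAISS store over the few-shot examples (one illustrative entry).
embeddings = OpenAIEmbeddings()
texts = [
    "Q: How many employees live in Delhi? "
    "SQL: SELECT COUNT(*) FROM Employee WHERE city = 'Delhi';"
]
store = FAISS.from_texts(texts, embeddings)

# Retriever that returns the most similar stored examples for a user input.
retriever = store.as_retriever(search_kwargs={"k": 3})

# Chain connecting the retriever to the LLM.
chain = ConversationalRetrievalChain.from_llm(
    llm=ChatOpenAI(model="gpt-3.5-turbo", temperature=0),
    retriever=retriever,
)

result = chain.invoke(
    {"question": "How many employees are in Mumbai?", "chat_history": []}
)
```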
Step 5:
I integrated OpenAI GPT-3.5-turbo as the main language model.
However, the architecture is flexible and can easily switch to other models like Gemini, Llama, or any open-source LLM.
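Because the chain only depends on LangChain's chat-model interface, swapping the LLM is a one-line change; the alternative classes below are examples and assume the corresponding integration packages are installed:

```python
from langchain_openai import ChatOpenAI

# Current choice from Step 5.
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

# Example swaps (assumed integrations, shown for illustration):
# Gemini:
#   from langchain_google_genai import ChatGoogleGenerativeAI
#   llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")
# A local Llama served via Ollama:
#   from langchain_community.chat_models import ChatOllama
#   llm = ChatOllama(model="llama3")
```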
Step 6:
I built a Flask-based API where:
The user submits a question.
FAISS retrieves related examples.
A final prompt is crafted combining the database schema, examples, and the user’s question.
The model generates the corresponding SQL query.
The SQL query is cleaned and validated.
The query is executed against the employee.db database.
The API returns both the generated SQL query and its results.
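The "cleaned and validated" step above can be sketched as a small helper that strips the markdown fences LLMs often wrap around SQL and allows only read-only SELECT statements; the function name and validation rules are illustrative, not the repository's actual implementation:

```python
import re

def clean_sql(raw: str) -> str:
    """Strip markdown fences from LLM output and allow only SELECT queries."""
    sql = raw.strip()
    # Remove a leading ``` or ```sql fence and a trailing ``` fence, if present.
    sql = re.sub(r"^```(?:sql)?\s*|\s*```$", "", sql).strip()
    if not sql.lower().startswith("select"):
        raise ValueError("Only SELECT queries are executed")
    # Normalize to exactly one trailing semicolon.
    return sql.rstrip(";") + ";"

cleaned = clean_sql("```sql\nSELECT name FROM Employee WHERE city = 'Pune'\n```")
```

A Flask handler would call such a helper on the model's raw output before running the query against employee.db, so malformed or destructive statements never reach the database.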
Step 7:
In the future, I plan to enhance the system by:
Supporting multi-table joins and data modification operations (INSERT/UPDATE/DELETE).
Integrating conversation history, allowing the bot to understand the flow of previous interactions for better and more connected answers.
Replacing the LLM with open-source models to reduce costs and improve control.
✅ To check the chatbot outputs, you can refer to the result_img folder available in the repository, where I have attached screenshots of the working results.