RAG_Medical_Data

This repository demonstrates a Retrieval-Augmented Generation (RAG) pipeline built on Llama-3-8B. It is part of a comparative study of fine-tuning versus RAG, intended to determine which approach better suits our use case.

The detailed blog can be found here.

RAG_medical.ipynb contains all the code needed to set up the RAG pipeline.

Dataset

This project uses publicly available medical data. The dataset is structured as prompt-completion pairs: each prompt is a medical question asked by a user, and each completion is the corresponding response from a doctor. (Data Source)
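
To make the prompt-completion structure concrete, here is a minimal sketch of how such pairs could feed the retrieval step of a RAG pipeline. Everything in it is an assumption for illustration: the field names `prompt` and `completion`, the sample records (made up, not taken from the dataset), and the helper functions `tokenize`, `cosine`, `retrieve`, and `build_prompt`. It scores records with a simple bag-of-words cosine similarity so that it runs with the standard library alone; the notebook may well use dense embeddings and a vector store instead, and the call to Llama-3-8B is left out.

```python
import math
import re
from collections import Counter

# Hypothetical records for illustration only; the real dataset's
# field names and contents may differ.
PAIRS = [
    {"prompt": "What are common symptoms of iron deficiency anemia?",
     "completion": "Fatigue, pale skin, and shortness of breath are common."},
    {"prompt": "How can I lower my blood pressure?",
     "completion": "Regular exercise and reducing salt intake are usually advised."},
    {"prompt": "How is a migraine headache treated?",
     "completion": "Rest in a dark room and prescribed pain relief are typical."},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(question, pairs, k=1):
    """Return the k pairs whose prompt is most similar to the question."""
    q_vec = Counter(tokenize(question))
    ranked = sorted(
        pairs,
        key=lambda p: cosine(q_vec, Counter(tokenize(p["prompt"]))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, retrieved):
    """Assemble an augmented prompt from the retrieved pairs."""
    context = "\n\n".join(
        f"Q: {p['prompt']}\nA: {p['completion']}" for p in retrieved
    )
    return (
        "Use the context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


question = "What lowers blood pressure?"
print(build_prompt(question, retrieve(question, PAIRS, k=1)))
```

The string returned by `build_prompt` is what would be passed to the language model. Swapping the bag-of-words scoring for an embedding model changes only `retrieve`; the prompt assembly stays the same.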

Overview of the pipeline: (image: pipeline overview screenshot, 2024-07-25)

For questions or feedback about the project, don't hesitate to reach out to me on LinkedIn.

The fine-tuning implementation for this study can be found here.