RAG_Medical_Data

This repository demonstrates a Retrieval-Augmented Generation (RAG) pipeline built on Llama-3-8B. It is part of a comparative study of fine-tuning versus RAG, aimed at determining which approach is more suitable for our use case.

The detailed blog can be found here.

RAG_medical.ipynb contains all the code needed to set up the RAG pipeline.

Dataset

This project uses publicly available medical data. The dataset is structured as prompt-completion pairs: each prompt is a medical question asked by a user, and each completion is the corresponding response from a doctor. (Data Source)

Overview of the pipeline:
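
As a rough illustration of the retrieve-then-generate flow, here is a minimal sketch in plain Python. It is not the notebook's implementation: the records are made-up placeholders in the prompt-completion shape described above, the bag-of-words cosine similarity stands in for whatever embedding model and vector store the notebook actually uses, and the names `RECORDS`, `retrieve`, and `build_prompt` are hypothetical. The final step (sending the assembled prompt to Llama-3-8B) is left out.

```python
import math
import re
from collections import Counter

# Hypothetical placeholder records in prompt-completion form (not real data).
RECORDS = [
    {"prompt": "What can I do about a persistent dry cough?",
     "completion": "Placeholder doctor response A."},
    {"prompt": "How much water should I drink each day?",
     "completion": "Placeholder doctor response B."},
    {"prompt": "Is it normal to feel tired after a long flight?",
     "completion": "Placeholder doctor response C."},
]


def vectorize(text):
    """Turn text into a bag-of-words count vector (a stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, records, k=2):
    """Return the k records whose prompts are most similar to the query."""
    query_vec = vectorize(query)
    ranked = sorted(
        records,
        key=lambda r: cosine(query_vec, vectorize(r["prompt"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, retrieved):
    """Assemble the retrieved pairs and the user question into one LLM prompt."""
    context = "\n\n".join(
        f"Question: {r['prompt']}\nDoctor: {r['completion']}" for r in retrieved
    )
    return (
        "Use the following doctor responses as context.\n\n"
        f"{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


query = "I have had a dry cough for two weeks"
top = retrieve(query, RECORDS, k=2)
prompt = build_prompt(query, top)
```

In a real pipeline, the string in `prompt` would then be passed to the language model, so that the answer is grounded in the retrieved question-answer pairs rather than in the model's parameters alone.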

For questions or feedback about the project, don't hesitate to reach out to me on LinkedIn.

The fine-tuning implementation for this study can be found here.
