AIR-RAG is an adaptive, iterative retrieval framework that enhances Retrieval-Augmented Generation (RAG) by optimizing both retrieval relevance and LLM alignment. AIR-RAG eliminates the need for complex retraining pipelines while delivering superior performance across multiple benchmark datasets.
- Adaptive Iterative Retrieval: Two-iteration process that progressively refines retrieval results
- Dual Alignment: Optimizes both retriever and LLM preferences simultaneously
- Weighted KTO Training: Uses Kahneman-Tversky Optimization for preference learning
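The per-example shape of a weighted KTO-style objective can be sketched as follows. This is an illustrative sketch only, not the repository's implementation: the function name, the fixed reference point `z_ref`, and the per-example `weight` are assumptions (in KTO proper, the reference point is estimated from a batch-level KL term).

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def weighted_kto_loss(policy_logp, ref_logp, desirable,
                      weight=1.0, beta=0.1, z_ref=0.0):
    """Illustrative weighted KTO-style loss for one example.

    policy_logp / ref_logp: log-probability of the completion under the
    policy and the frozen reference model; desirable: True for preferred
    outputs, False for dispreferred ones; weight: per-example weighting.
    """
    # Implied reward: scaled log-ratio of policy vs. reference model.
    reward = beta * (policy_logp - ref_logp)
    if desirable:
        # Push the reward above the reference point.
        value = sigmoid(reward - z_ref)
    else:
        # Push the reward below the reference point.
        value = sigmoid(z_ref - reward)
    return weight * (1.0 - value)
```

Desirable examples whose reward already exceeds the reference point contribute little loss, mirroring KTO's asymmetric (prospect-theoretic) treatment of gains and losses.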
- Clone the repository:

```bash
git clone https://github.com/aialt/AIR-RAG.git
cd AIR-RAG
```

- Install the environment using pixi:

```bash
# Make sure pixi is installed, then install the environment
pixi install
```

- Install additional dependencies:

```bash
python -m spacy download en_core_web_sm
python -m nltk.downloader punkt
```

Download the generator model Llama3-8B-baseline, the ColBERT index, and the datasets from https://github.com/fate-ubw/RAGLAB.
```bash
cd AIR-RAG
mkdir -p model/{colbertv2.0,contriever-msmarco,Llama3-8B-baseline,selfrag_llama3_8b-epoch_0_1}

# Retriever models
huggingface-cli download colbert-ir/colbertv2.0 --local-dir model/colbertv2.0/ --local-dir-use-symlinks False
huggingface-cli download facebook/contriever-msmarco --local-dir model/contriever-msmarco/ --local-dir-use-symlinks False

# Generator models (8B)
huggingface-cli download RAGLAB/Llama3-8B-baseline --local-dir model/Llama3-8B-baseline/ --local-dir-use-symlinks False
huggingface-cli download RAGLAB/selfrag_llama3-8B --local-dir model/selfrag_llama3_8b-epoch_0_1/ --local-dir-use-symlinks False
```

Download the complete datasets:

```bash
cd AIR-RAG
huggingface-cli download RAGLAB/data --local-dir data --repo-type dataset
```

Set your OpenAI API key in api_keys.txt.
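The exact format of api_keys.txt is not specified here; a minimal loader, under the assumption that the OpenAI key sits on the first non-empty, non-comment line (the helper name and parsing rules are mine, not the repository's):

```python
from pathlib import Path

# Hypothetical helper: the repository's actual parsing of api_keys.txt
# may differ. This assumes one key per line, comments starting with '#'.
def load_api_key(path="api_keys.txt"):
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            return line
    raise ValueError(f"no API key found in {path}")
```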
Start the ColBERT server:

```bash
sh run/colbert_server_wiki2018.sh
```
Bring up the generator LLM with a vLLM server:

```bash
sh run/generator_vllm.sh
```
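Once it is running, the vLLM server exposes an OpenAI-compatible REST API. A minimal stdlib-only client sketch; the port 8000, the endpoint path, and the model name below are assumptions, so match them to whatever run/generator_vllm.sh actually configures:

```python
import json
import urllib.request

# Assumed host/port/model; check run/generator_vllm.sh for real values.
def build_completion_request(prompt, model="model/Llama3-8B-baseline",
                             url="http://localhost:8000/v1/completions"):
    """Build an HTTP request for vLLM's OpenAI-compatible completions API."""
    payload = {"model": model, "prompt": prompt, "max_tokens": 128}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

def query(prompt):
    """Send the request and return the generated text."""
    with urllib.request.urlopen(build_completion_request(prompt)) as resp:
        return json.loads(resp.read())["choices"][0]["text"]
```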
Run preference data collection and assemble the training file:

```bash
pixi run python labeller.py
cat data/labelled_training_data/triviaqa-labelled.jsonl > data/labelled_training_data/train_all.jsonl
cat data/labelled_training_data/hotpot-labelled.jsonl >> data/labelled_training_data/train_all.jsonl
pixi run python process_all_train_jsonl.py
```
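The two `cat` commands above can also be done in Python with a per-record validity check; a sketch (the helper name and the choice to silently drop malformed lines are mine, not the repository's):

```python
import json
from pathlib import Path

def merge_jsonl(inputs, output):
    """Concatenate labelled .jsonl files into one, skipping lines that
    are not valid JSON, and return the number of records kept."""
    kept = 0
    with open(output, "w") as out:
        for path in inputs:
            for line in Path(path).read_text().splitlines():
                line = line.strip()
                if not line:
                    continue
                try:
                    json.loads(line)
                except json.JSONDecodeError:
                    continue  # drop malformed records instead of failing later
                out.write(line + "\n")
                kept += 1
    return kept
```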
The base model is meta-llama/Llama-3.1-8B-Instruct. Run the KTO training script:

```bash
sh run/run_kto_llama.sh
```
Start the adapter model:

```bash
sh run/gasket_vllm.sh
```

Also start the ColBERT server and the generator LLM as described above.
Run the evaluation:

```bash
sh run/run_exp.sh
```
To run other baseline algorithms, see the run/baseline directory. For example, to run all Active RAG configurations with GPT-3.5:

```bash
sh run/baseline/run_active_rag_gpt35.sh
```
Note: you need to set up the full ColBERT server (requires 60GB+ RAM). First update the paths in config/colbert_server/colbert_server.yaml:

```yaml
index_dbPath: {your_root_path}/AIR-RAG/data/retrieval/colbertv2.0_embedding/wiki2018
text_dbPath: {your_root_path}/AIR-RAG/data/retrieval/colbertv2.0_passages/wiki2018/wiki2018.tsv
```
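Before launching, you can sanity-check that the configured paths actually exist on disk. This is a naive line parser that only handles flat `key: value` entries like the snippet above; use a real YAML loader for anything more complex:

```python
from pathlib import Path

def missing_colbert_paths(config_text):
    """Return the *_dbPath keys whose configured paths do not exist.

    Only understands flat `key: value` lines; sufficient for checking
    the two paths shown above before starting the ColBERT server.
    """
    missing = []
    for line in config_text.splitlines():
        if "_dbPath" in line and ":" in line:
            key, _, value = line.partition(":")
            if not Path(value.strip()).exists():
                missing.append(key.strip())
    return missing
```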
Then start the server:

```bash
sh run/colbert_server/colbert_server.sh
```

AIR-RAG demonstrates superior performance across six benchmark datasets spanning three task categories:
- Open-Domain QA: TriviaQA, PopQA
- Multi-hop QA: HotpotQA, WikiMultiHopQA
- Fact-checking: PubHealth, StrategyQA
Our code development is based on RAGLAB [arXiv:2408.11381]. We would like to thank them for their excellent work and for providing the code framework and baseline implementations that made this research possible. Their contributions to the RAG community have been invaluable.
