This repository contains a Python script that demonstrates the power of reranking in two critical NLP applications:
- Evaluating and comparing outputs from multiple large language models
- Finding the most relevant documents from a collection based on a query
The tool uses Cohere's API as an implementation example, but the concepts can be applied with any reranking system.
Reranking is a powerful technique that goes beyond traditional search and retrieval methods. Unlike simple keyword matching or first-pass embedding similarity, a reranking model reads the query and each candidate together and scores how well the candidate satisfies the query's specific criteria.
Key advantages of reranking:
- Contextual understanding: Evaluates semantic relevance beyond keyword matching
- Multi-dimensional assessment: Considers multiple aspects of quality simultaneously
- Flexible criteria: Can be tailored to specific evaluation needs
- Comparative analysis: Provides relative quality scores across multiple options
Our script demonstrates two distinct reranking workflows:

Document reranking:
- Provides a collection of documents on a related topic
- Defines a specific query with clear evaluation criteria
- Uses a reranking model to score each document's relevance to that query
- Returns the top N most relevant documents with their scores

Model comparison:
- Sends the same prompt to multiple language models
- Collects each model's response
- Uses a reranking model to evaluate all responses against the query criteria
- Ranks the responses by relevance score, showing which model performed best
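Both workflows share the same core loop: score every candidate against the query, then sort by score and keep the top N. A minimal sketch of that loop, using a simple keyword-overlap scorer as a stand-in for a real reranking model (a hosted reranker such as Cohere's would replace `score`; all names and sample texts here are illustrative):

```python
def score(query: str, document: str) -> float:
    """Stand-in relevance scorer: fraction of query terms present in the
    document. A real reranking model replaces this with semantic scoring."""
    query_terms = set(query.lower().split())
    doc_terms = set(document.lower().split())
    return len(query_terms & doc_terms) / len(query_terms)

def rerank(query: str, candidates: list[str], top_n: int = 3) -> list[tuple[float, str]]:
    """Score every candidate against the query and return the top N."""
    scored = [(score(query, c), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]

# Workflow 1: rerank a document collection.
docs = [
    "Solar and wind power are key renewable energy technologies.",
    "The history of the printing press in early modern Europe.",
    "Carbon capture removes CO2 emissions from industrial sources.",
]
top = rerank("renewable energy and carbon capture", docs, top_n=2)

# Workflow 2: rerank model responses — the call is identical, only the
# candidates change from documents to model outputs.
responses = {"model-a": "Renewable energy transition ...", "model-b": "..."}
ranked = rerank("renewable energy and carbon capture", list(responses.values()))
```

The symmetry is the point: once candidates are reduced to plain text, "which document answers the query" and "which model answered best" are the same reranking problem.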
The repository includes an example query specifically designed for effective reranking:

> Identify the most comprehensive explanation of solutions to climate change that includes both technological and policy approaches. The best response should discuss renewable energy transition, carbon capture technologies, international agreements, and address the challenges of implementation. Prioritize explanations that balance optimism with realistic assessment of challenges.
What makes this an effective reranking query:
- Specific criteria: Clearly defines what makes a response "good"
- Multiple dimensions: Evaluates across several aspects (comprehensiveness, balance, etc.)
- Detailed expectations: Specifies content elements that should be present
- Quality differentiation: Allows meaningful distinction between superficial and thorough responses
- Python 3.11+
- A Cohere API key (https://dashboard.cohere.com/)
- Clone this repository
- Install dependencies:

```
pip install -r requirements.txt
```
- Create a `.env` from the `.env.example` with the API key from https://dashboard.cohere.com/api-keys:

```
COHERE_API_KEY=your_api_key_here
```
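At startup, the script needs that key in its environment. A minimal stdlib-only sketch of loading it (the actual script may use a library such as python-dotenv; `load_env` is an illustrative name):

```python
import os

def load_env(path: str = ".env") -> None:
    """Parse simple KEY=value lines from a .env file into os.environ,
    skipping blank lines and comments. Existing variables win."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```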
Run the script:

```
python rerankers.py
```

The script will:
- Perform the document reranking demo:
  - Rerank a collection of climate change documents
  - Display the top N most relevant documents with scores
- Perform the model comparison demo:
  - Query multiple models with the same prompt
  - Display a preview of each model's response
  - Rerank all responses based on relevance to the criteria
  - Show the top N ranked responses with scores
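The score-plus-preview output described above can be produced with a small formatting helper. A sketch (the function name and layout are illustrative, not the script's actual output format):

```python
def format_results(ranked: list[tuple[float, str]], preview_len: int = 60) -> str:
    """Render (score, text) pairs as a numbered list with truncated previews."""
    lines = []
    for rank, (score, text) in enumerate(ranked, start=1):
        preview = text if len(text) <= preview_len else text[:preview_len] + "..."
        lines.append(f"{rank}. [{score:.3f}] {preview}")
    return "\n".join(lines)

print(format_results([
    (0.92, "Carbon capture and renewable energy together ..."),
    (0.41, "A general overview of climate policy."),
]))
```

Truncating to a fixed preview length keeps long model responses readable side by side while the bracketed score makes the ranking explicit.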
Reranking can be applied in numerous real-world scenarios:
- Research assistance: Find the most relevant academic papers for a research question
- Legal discovery: Identify the most pertinent documents in a large collection
- Knowledge management: Surface the most helpful documentation for internal teams
- LLM benchmarking: Compare model performance on specific tasks
- Response quality assurance: Ensure AI outputs meet quality standards
- Model selection: Identify which model performs best for specific use cases
- Content recommendation: Surface the most relevant content for users
- Data filtering: Select highest quality examples for training datasets
- Summarization: Identify the most important passages to include in summaries
You can easily modify the script to:
- Change the document collection
- Adjust the reranking query
- Modify the number of top results returned
- Use different models for generation or reranking
- Apply different formatting for the results
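In practice, the knobs listed above reduce to a handful of constants near the top of the script. A hypothetical configuration block (all names and values here are illustrative, not the script's actual identifiers):

```python
# Hypothetical configuration block — the real script's names may differ.
DOCUMENTS = [
    "Wind and solar are scaling rapidly but face grid-integration hurdles.",
    "The Paris Agreement sets national emission-reduction targets.",
]
RERANK_QUERY = "comprehensive climate solutions covering technology and policy"
TOP_N = 2                                     # number of top results returned
GENERATION_MODELS = ["model-a", "model-b"]    # models queried for comparison
RERANK_MODEL = "your-rerank-model-id"         # reranking model identifier
RESULT_TEMPLATE = "{rank}. ({score:.2f}) {text}"  # result line formatting
```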
This project is licensed under the MIT License - see the LICENSE file for details.