@tonywu71
Contributor

@tonywu71 tonywu71 commented Feb 6, 2025

Description

  • Add token pooling support to the VisionRetriever class
  • Add tests for the VisionRetriever class

Example usage:

from colpali_engine.compression import HierarchicalTokenPooler
from vidore_benchmark.retrievers import VisionRetriever

# Initialize your model and processor
# ...

# Create the token pooler (reduces the number of embedding vectors per passage)
token_pooler = HierarchicalTokenPooler()

# Initialize retriever with token pooling
retriever = VisionRetriever(
    model=model,
    processor=processor,
    token_pooler=token_pooler,
    num_workers=2,  # Optional: for faster processing
)

# Use as normal
query_embeddings = retriever.forward_queries(queries)
passage_embeddings = retriever.forward_passages(
    passages,
    pooling_kwargs={"pool_factor": 3},
)  # will use token pooling
scores = retriever.get_scores(query_embeddings, passage_embeddings)
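
To give a rough sense of what `pool_factor` controls in the call above, here is a minimal sketch that averages fixed windows of consecutive token vectors. This is a deliberately simplified stand-in, not the method used by `HierarchicalTokenPooler` (which, as its name suggests, groups tokens hierarchically rather than by position). The function name `mean_pool_fixed_windows` and the toy data are hypothetical and are not part of `colpali_engine` or `vidore_benchmark`.

```python
from typing import List

Vector = List[float]


def mean_pool_fixed_windows(token_embeddings: List[Vector], pool_factor: int) -> List[Vector]:
    """Average each run of `pool_factor` consecutive token vectors into one vector.

    Hypothetical helper for illustration only: a positional stand-in for
    token pooling, not the library's hierarchical approach.
    """
    if pool_factor < 1:
        raise ValueError("pool_factor must be >= 1")
    pooled: List[Vector] = []
    for start in range(0, len(token_embeddings), pool_factor):
        window = token_embeddings[start:start + pool_factor]
        dim = len(window[0])
        # Component-wise mean over the vectors in this window.
        pooled.append([sum(vec[i] for vec in window) / len(window) for i in range(dim)])
    return pooled


# Toy data: seven 2-dimensional token vectors.
tokens = [[float(i), float(2 * i)] for i in range(7)]
pooled_tokens = mean_pool_fixed_windows(tokens, pool_factor=3)
print(len(tokens), "->", len(pooled_tokens))
```

With a pool factor of 3, the number of stored vectors per passage drops to about a third in this sketch, which is the kind of reduction the `pooling_kwargs={"pool_factor": 3}` argument is presumably meant to achieve; the trade-off between index size and retrieval quality depends on the actual pooling method.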

Tests (local run)

[Screenshot of the local test run, taken 2025-04-10 at 15:03:02]

@tonywu71 tonywu71 added the enhancement New feature or request label Feb 6, 2025
@tonywu71 tonywu71 self-assigned this Feb 6, 2025
@tonywu71 tonywu71 force-pushed the add-support-for-token-pooling-for-vision-retriever-wrapper branch 2 times, most recently from b9cde16 to 2c81580 Compare February 13, 2025 09:26
@tonywu71 tonywu71 marked this pull request as draft February 13, 2025 12:23
@tonywu71 tonywu71 requested a review from Copilot April 10, 2025 12:42
Contributor

Copilot AI left a comment

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

tests/retrievers/test_vision_retriever.py:53

  • [nitpick] The type annotation 'ColIdefics3Retriever' is inconsistent with the fixture, which instantiates a VisionRetriever. Consider updating the type annotation to match the actual instance returned by the fixture.
def test_forward_queries(self, retriever: ColIdefics3Retriever, queries_fixture):

@tonywu71 tonywu71 marked this pull request as ready for review April 10, 2025 12:48
@tonywu71 tonywu71 force-pushed the add-support-for-token-pooling-for-vision-retriever-wrapper branch from 1a47dcf to f7e15eb Compare April 10, 2025 12:50
Collaborator

@QuentinJGMace QuentinJGMace left a comment

All good for me, sorry for the delay

@tonywu71 tonywu71 merged commit 0957511 into main Apr 25, 2025
5 checks passed
@tonywu71 tonywu71 deleted the add-support-for-token-pooling-for-vision-retriever-wrapper branch April 25, 2025 13:04
Labels

enhancement New feature or request

3 participants