DRAMA Model Demo output not matching #20

@sachin-singh-12

Hi team, I am testing the DRAMA model sample code (link), but the actual output does not match the expected output.

Sample code:

import torch  # needed for torch.cuda.is_available() below
from transformers import AutoTokenizer, AutoModel


queries = [
    'What percentage of the Earth\'s atmosphere is oxygen?',
    '意大利首都是哪里?',
]
documents = [
    "The amount of oxygen in the atmosphere has fluctuated over the last 600 million years, reaching a peak of 35% during the Carboniferous period, significantly higher than today's 21%.",
    "羅馬是欧洲国家意大利首都和罗马首都广域市的首府及意大利全国的政治、经济、文化和交通中心,位于意大利半島中部的台伯河下游平原地,建城初期在七座小山丘上,故又名“七丘之城”。按城市范围内的人口计算,罗马是意大利人口最多的城市,也是欧盟人口第三多的城市。",
]


model_name = "facebook/drama-base"
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, trust_remote_code=True).to(device)

query_embs = model.encode_queries(tokenizer, queries)
doc_embs = model.encode_documents(tokenizer, documents)

scores = query_embs @ doc_embs.T
print(scores.tolist())

Expected output: [[0.5310, 0.0821], [0.1298, 0.6181]]

Actual output: [[0.4584735929965973, 0.24322254955768585], [0.12728893756866455, 0.5092089176177979]]

Colab Notebook link - https://colab.research.google.com/drive/1FkJMGEJBX7BGsoLeGiJdxCKnMBmMG19n?usp=sharing
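
For reference, the gap between the two matrices can be quantified with a few lines of plain Python. This is only a sketch: the numbers are copied from the outputs above, and the helper names `max_abs_diff` and `top_document` are made up for illustration and are not part of the DRAMA code.

```python
# Score matrices copied from the issue text (rows = queries, columns = documents).
expected = [[0.5310, 0.0821], [0.1298, 0.6181]]
actual = [[0.4584735929965973, 0.24322254955768585],
          [0.12728893756866455, 0.5092089176177979]]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two same-shaped matrices."""
    return max(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def top_document(scores):
    """Index of the highest-scoring document for each query row."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


print("max abs diff:", round(max_abs_diff(expected, actual), 4))
print("top document (expected):", top_document(expected))
print("top document (actual):  ", top_document(actual))
```

With these values, the top-ranked document per query is the same in both matrices, so the ranking matches even though the raw scores differ by up to about 0.16.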

What is causing this mismatch? I also tested the sample code with new values:

queries = [
    'iphone',
    'cat food',
]
documents = [
    'iphone 16 pro max',
    'best cat food',
]

Output: [[0.40802454948425293, 0.26841771602630615], [0.27385222911834717, 0.5687180757522583]]

Is this the correct behavior? The relevance seems quite poor.
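
One way to put a number on "quite poor" is the margin between the matching and the non-matching document for each query. A minimal sketch, assuming the matching document for query i is document i (as in both examples in this issue); `margins` is a hypothetical helper name:

```python
# Scores from the second example in this issue (rows = queries, columns = documents).
scores = [[0.40802454948425293, 0.26841771602630615],
          [0.27385222911834717, 0.5687180757522583]]


def margins(scores):
    """For each query i: score of document i minus the best score among the other documents."""
    result = []
    for i, row in enumerate(scores):
        best_other = max(s for j, s in enumerate(row) if j != i)
        result.append(row[i] - best_other)
    return result


for query, margin in zip(['iphone', 'cat food'], margins(scores)):
    print(f"{query}: margin = {margin:.4f}")
```

A positive margin means the matching document is still ranked first. Here both margins are positive (roughly 0.14 and 0.29).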

Thanks
