
feat: make general purpose metrics more general #1666

Open · wants to merge 6 commits into main

Conversation

jjmachan (Member) commented Nov 13, 2024

Metrics Converted

  • Aspect Critic
  • Simple Criteria
  • Rubric Based - both Instance and Domain specific

A few different examples:

Aspect Critic

from ragas.metrics import AspectCritic
from ragas.dataset_schema import SingleTurnSample

only_response = SingleTurnSample(
    response="The Eiffel Tower is located in Paris."
)

grammar_critic = AspectCritic(
    name="grammar",
    definition="Is the response grammatically correct?",
    llm=evaluator_llm,  # an evaluator LLM wrapper, defined elsewhere
)

await grammar_critic.single_turn_ascore(only_response)

With a reference:

answer_correctness_critic = AspectCritic(
    name="answer_correctness",
    definition="Are the response and the reference answer the same?",
    llm=evaluator_llm
)

# data row
sample = SingleTurnSample(
    user_input="Where is the Eiffel Tower located?",
    response="The Eiffel Tower is located in Paris.",
    reference="London"
)
await answer_correctness_critic.single_turn_ascore(sample)

Note: this only works for multi-turn metrics for now

@dosubot bot added the size:XXL label (This PR changes 1000+ lines, ignoring generated files.) Nov 13, 2024
shahules786 (Member) commented Nov 14, 2024

  • Say I am evaluating with a dataset that contains user_input, response, reference, and retrieved_contexts, and I run two metrics on it at the same time, i.e. context_recall and an aspect critic (harmfulness or similar). With this interface, even though my aspect-critic metric does not need retrieved contexts, it will use them, and there is no way to opt out. I think this can occur when running the metrics through the evaluate interface.

  • When using a single metric, this seems fine. @jjmachan
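The opt-out concern above can be illustrated with plain Python sets (a hypothetical sketch; `columns_used` and the exact selection rule are illustrative only, not the actual ragas API — the column names follow the schema discussed in this PR):

```python
# Hypothetical sketch of the opt-out problem described above.
# `columns_used` is illustrative only, not the actual ragas API.

dataset_row = {"user_input", "response", "reference", "retrieved_contexts"}

def columns_used(required: set, optional: set, row: set) -> set:
    # A metric reads its required columns plus any optional column
    # that happens to be present in the dataset row.
    return (required & row) | (optional & row)

# context_recall genuinely needs retrieved_contexts ...
recall_used = columns_used(
    {"user_input", "retrieved_contexts", "reference"}, set(), dataset_row
)

# ... but an aspect critic (e.g. harmfulness) also picks up
# retrieved_contexts, simply because the column exists and is optional.
critic_used = columns_used(
    {"user_input", "response"},
    {"retrieved_contexts", "reference"},
    dataset_row,
)
```

Under this selection rule, both metrics end up reading retrieved_contexts even though only context_recall needs it, which is exactly the behavior being questioned.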

Comment on lines 129 to 135
MetricType.SINGLE_TURN: {
    "user_input",
    "response",
    "user_input:optional",
    "response:optional",
    "retrieved_contexts:optional",
    "reference:optional",
    "reference_contexts:optional",
},
jjmachan (Member, Author):
you can change it here, so it's under user control
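As a hedged sketch of that user control (the set literal mirrors the SINGLE_TURN entry in the diff above; the trimming logic is hypothetical, not the verified ragas API), opting a metric out of the optional columns it should not consume might look like:

```python
# Hypothetical sketch: trimming the required-columns set so an
# aspect-critic metric only reads user_input and response.
# The set literal mirrors the SINGLE_TURN entry in the diff above.

single_turn_columns = {
    "user_input",
    "response",
    "user_input:optional",
    "response:optional",
    "retrieved_contexts:optional",
    "reference:optional",
    "reference_contexts:optional",
}

keep = {"user_input", "response"}

# Drop every column (optional or not) whose base name is not in `keep`.
trimmed = {
    col for col in single_turn_columns
    if col.split(":")[0] in keep
}
```

The idea is that a user who knows a metric should ignore retrieved_contexts can shrink the metric's column set before evaluation, rather than relying on the defaults.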

shahules786 (Member) left a comment:

Reflect changes in docs.
