feat(cli/evals): Generate metricSummaries #2768

ssbushi · 2025-04-15T20:33:26Z

Tooling part of #1642

Checklist (if applicable):

PR title is following https://www.conventionalcommits.org/en/v1.0.0/
Tested (manually, unit tested, etc.)

genkit-tools/common/src/eval/parser.ts

shrutip90 · 2025-04-16T20:49:28Z

genkit-tools/common/tests/eval/parser_test.ts

+        const booleanScores = reMapScores(simpleEvalOutput, (response, i) => ({
+          testCaseId: response.testCaseId,
+          evaluation: {
+            score: i % 2 === 0,


This is hard to read. I will have to look at simpleEvalOutput and do the math to understand what this should be. Please just define the test case here.

I was trying to avoid redefining the EvalResponse object for every test case, since it is already a large object. I can add comments to the override function here to make the scores more obvious. Let me know if that is helpful.

If not, I can definitely repeat the EvalResponse for each test case.

Thank you, this is better, but I would prefer having a different helper like I suggested below: "Maybe instead of reMapScores you can have a helper that takes an array of Score objects and generates Record<string, EvalResponse> with testcaseIds. That might be concise enough and easier to read."

Please check if we can do this.

genkit-tools/common/src/eval/parser.ts

shrutip90 · 2025-04-16T20:53:26Z

genkit-tools/common/tests/eval/parser_test.ts

+        const stringScores = reMapScores(simpleEvalOutput, (response, i) => ({
+          testCaseId: response.testCaseId,
+          evaluation: {
+            score: `TYPE_${i % 2}`,


ditto. Maybe instead of reMapScores you can have a helper that takes an array of Score objects and generates Record<string, EvalResponse> with testcaseIds. That might be concise enough and easier to read.
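
For illustration, a minimal sketch of what such a helper might look like. Everything here is an assumption rather than code from this PR: the `Score`, `EvalFnResponse`, and `EvalResponse` types are simplified stand-ins for the real Genkit types (whose shapes may differ), and `makeEvalOutput`, the `case-N` IDs, and the `/evaluator/test` key are hypothetical names.

```typescript
// Simplified stand-ins for the real types (assumption: actual shapes may differ).
interface Score {
  score?: number | string | boolean;
  error?: string;
}

interface EvalFnResponse {
  testCaseId: string;
  evaluation: Score;
}

type EvalResponse = EvalFnResponse[];

// Hypothetical helper: builds one response per score, keyed under a single
// evaluator name, and assigns sequential test case IDs (case-1, case-2, ...).
function makeEvalOutput(
  scores: Score[],
  evaluator = '/evaluator/test'
): Record<string, EvalResponse> {
  return {
    [evaluator]: scores.map((evaluation, i) => ({
      testCaseId: `case-${i + 1}`,
      evaluation,
    })),
  };
}

// Each test spells out its scores inline, so no index arithmetic is needed
// to see what the expected summary should be.
const booleanScores = makeEvalOutput([
  { score: true },
  { score: false },
  { score: true },
]);

const stringScores = makeEvalOutput([
  { score: 'TYPE_0' },
  { score: 'TYPE_1' },
  { score: 'TYPE_0' },
]);
```

With a helper along these lines, the scores under test are visible at the call site, and the large shared fixture does not need to be repeated per test.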

pavelgj · 2025-04-19T13:25:56Z

nit: PR & commit message titles should be feat(cli) or feat(cli/evals)

genkit-tools/common/src/eval/parser.ts

ssbushi · 2025-04-22T14:15:21Z

nit: PR & commit message titles should be feat(cli) or feat(cli/evals)

This is not specific to the CLI, but okay.

shrutip90 · 2025-04-28T19:18:59Z

genkit-tools/common/tests/eval/parser_test.ts

+        const booleanScores = reMapScores(simpleEvalOutput, (response, i) => ({
+          testCaseId: response.testCaseId,
+          evaluation: {
+            score: i % 2 === 0,


Thank you, this is better, but I would prefer having a different helper like I suggested below: "Maybe instead of reMapScores you can have a helper that takes an array of Score objects and generates Record<string, EvalResponse> with testcaseIds. That might be concise enough and easier to read."

Please check if we can do this.

ssbushi added 4 commits April 15, 2025 16:03

feat: support metric summaries

36d884b

merge

030caf8

new lock

9b70f5b

fixes

9b87caf

github-project-automation bot added this to Genkit Backlog Apr 15, 2025

github-actions bot added js tooling config labels Apr 15, 2025

ssbushi requested review from shrutip90 and pavelgj April 15, 2025 20:39

fix types

5018132

shrutip90 reviewed Apr 16, 2025

feedback

17d7249

ssbushi requested a review from shrutip90 April 17, 2025 18:04

pavelgj reviewed Apr 19, 2025

genkit-tools/common/src/eval/parser.ts (outdated, resolved)

ssbushi changed the title ~~feat(evals): Generate metricSummaries~~ feat(cli/evals): Generate metricSummaries Apr 22, 2025

ssbushi added 2 commits April 22, 2025 10:57

remove lodash

795b193

Merge branch 'main' into sb/summaryMetrics

a5b26a7

ssbushi requested a review from pavelgj April 22, 2025 14:58

shrutip90 approved these changes Apr 28, 2025