confident-ai / deepeval Public

Pull requests: confident-ai/deepeval

107 Open 1,974 Closed

Pull requests list

fix: correct dataset field defaults, add source-grouping to ContextualRecall, short-circuit Summarization identity

#2806 opened Jun 26, 2026 by medhwu

docs: note logprob-less gpt-5 models are unsupported for G-Eval (#2280)

#2805 opened Jun 26, 2026 by minh2416294

docs(tool-correctness): fix nonexistent ToolCallParams.TOOL reference (#1510)

#2804 opened Jun 26, 2026 by minh2416294

fix(metrics): return score 1.0 when SummarizationMetric input equals output

#2803 opened Jun 25, 2026 by mimran-khan

feat(benchmarks): add batch_generate support to SQuAD benchmark

#2802 opened Jun 25, 2026 by GouravSingal-code

Add chatbot evaluation example awaiting code fix

#2801 opened Jun 25, 2026 by sunnyg-sdet

Add TS support for synthetic data, simulation parity, and prompt optimization

#2798 opened Jun 24, 2026 by theanuragg

fix(dag): make DAGMetric instances safely reusable across measure() calls

#2795 opened Jun 24, 2026 by jhochenbaum • Draft

2 of 3 tasks

Fix HallucinationMetric silently passing when context is empty

#2794 opened Jun 23, 2026 by k-dickinson

feat(synthesizer): add optional expected_output_schema for structured goldens

#2793 opened Jun 21, 2026 by bonpiedlaroute Contributor

fix(dataset): correct invalid field defaults and except clause

#2792 opened Jun 20, 2026 by bongho Contributor

feat(metrics): support document-type specific threshold overrides

#2791 opened Jun 20, 2026 by rohitmannur007

docs(examples): add heterogeneous financial document RAG evaluation example with threshold_overrides

#2790 opened Jun 20, 2026 by Ruthwik-Data Contributor • Draft

1 task done

test(metrics): add overlapping-chunk regression fixtures for ContextualRecallMetric (closes #2788)

#2789 opened Jun 20, 2026 by Ruthwik-Data Contributor • Draft

1 task done

test(metrics): add overlapping-chunk regression fixtures for ContextualPrecisionMetric (rebased on #2743)

#2787 opened Jun 20, 2026 by Ruthwik-Data Contributor • Draft

1 task done

fix(execute): resolve #2216 — AsyncConfig import problem

#2786 opened Jun 20, 2026 by Sudhanwa-git

3 of 4 tasks

feat: add AgentLoopDetectionMetric scaffold

#2782 opened Jun 19, 2026 by rohitmannur007

fix(llms): prevent trim_and_load_json from corrupting valid JSON strings

#2780 opened Jun 18, 2026 by cschanhniem

4 tasks done

fix(tracing): make trace API payloads JSON-safe

#2773 opened Jun 17, 2026 by 2830500285

fix: use is not None for shots_dataset checks in GSM8K and HellaSwag benchmarks

#2772 opened Jun 17, 2026 by ashishlandiwal

fix: trim_and_load_json in models/llms corrupts valid JSON string values

#2771 opened Jun 16, 2026 by Hiyaarora

fix: add re.DOTALL so RetrievedContextData with newlines survives dataset round-trip

#2769 opened Jun 16, 2026 by JSap0914

Add retail support evaluation example with synthetic dataset

#2767 opened Jun 15, 2026 by kadiryonak

docs: document chunk_size and min_contexts tuning for short documents in Synthesizer

#2765 opened Jun 15, 2026 by Goutham-Annem

docs: add generate_verdicts custom template example for AnswerRelevancyMetric

#2764 opened Jun 15, 2026 by Goutham-Annem
