Pull requests: confident-ai/deepeval
Pull requests list
fix: correct dataset field defaults, add source-grouping to ContextualRecall, short-circuit Summarization identity
#2806
opened Jun 26, 2026 by
medhwu
docs: note logprob-less gpt-5 models are unsupported for G-Eval (#2280)
#2805
opened Jun 26, 2026 by
minh2416294
docs(tool-correctness): fix nonexistent ToolCallParams.TOOL reference (#1510)
#2804
opened Jun 26, 2026 by
minh2416294
fix(metrics): return score 1.0 when SummarizationMetric input equals output
#2803
opened Jun 25, 2026 by
mimran-khan
feat(benchmarks): add batch_generate support to SQuAD benchmark
#2802
opened Jun 25, 2026 by
GouravSingal-code
Add TS support for synthetic data, simulation parity, and prompt optimization
#2798
opened Jun 24, 2026 by
theanuragg
fix(dag): make DAGMetric instances safely reusable across measure() calls
#2795
opened Jun 24, 2026 by
jhochenbaum
Draft
2 of 3 tasks
Fix HallucinationMetric silently passing when context is empty
#2794
opened Jun 23, 2026 by
k-dickinson
feat(synthesizer): add optional expected_output_schema for structured goldens
#2793
opened Jun 21, 2026 by
bonpiedlaroute
Contributor
fix(dataset): correct invalid field defaults and except clause
#2792
opened Jun 20, 2026 by
bongho
Contributor
feat(metrics): support document-type specific threshold overrides
#2791
opened Jun 20, 2026 by
rohitmannur007
docs(examples): add heterogeneous financial document RAG evaluation example with threshold_overrides
#2790
opened Jun 20, 2026 by
Ruthwik-Data
Contributor
Draft
1 task done
test(metrics): add overlapping-chunk regression fixtures for ContextualRecallMetric (closes #2788)
#2789
opened Jun 20, 2026 by
Ruthwik-Data
Contributor
Draft
1 task done
test(metrics): add overlapping-chunk regression fixtures for ContextualPrecisionMetric (rebased on #2743)
#2787
opened Jun 20, 2026 by
Ruthwik-Data
Contributor
Draft
1 task done
fix(execute): resolve #2216 — AsyncConfig import problem
#2786
opened Jun 20, 2026 by
Sudhanwa-git
3 of 4 tasks
fix(llms): prevent trim_and_load_json from corrupting valid JSON strings
#2780
opened Jun 18, 2026 by
cschanhniem
4 tasks done
fix: use is not None for shots_dataset checks in GSM8K and HellaSwag benchmarks
#2772
opened Jun 17, 2026 by
ashishlandiwal
fix: trim_and_load_json in models/llms corrupts valid JSON string values
#2771
opened Jun 16, 2026 by
Hiyaarora
fix: add re.DOTALL so RetrievedContextData with newlines survives dataset round-trip
#2769
opened Jun 16, 2026 by
JSap0914
Add retail support evaluation example with synthetic dataset
#2767
opened Jun 15, 2026 by
kadiryonak
docs: document chunk_size and min_contexts tuning for short documents in Synthesizer
#2765
opened Jun 15, 2026 by
Goutham-Annem
docs: add generate_verdicts custom template example for AnswerRelevancyMetric
#2764
opened Jun 15, 2026 by
Goutham-Annem