Releases · vibrantlabsai/ragas

13 Jan 17:47

anistark

v0.4.3

4ecab38

v0.4.3 Latest

What's Changed

feat: add DSPyOptimizer with MIPROv2 for advanced prompt optimization by @anistark in #2537
feat(docs): add llms.txt generation for LLM-friendly documentation by @sanjeed5 in #2539
feat: dspy caching by @anistark in #2542
feat: add system prompt support for InstructorLLM and LiteLLMStructuredLLM by @anistark in #2543
feat(docs): add copy-to-llm button for easy AI tool integration by @sanjeed5 in #2541
fix: use PAT token for docs-check CI as docs-apply CI by @anistark in #2546
feat: add remaining quickstart templates by @anistark in #2547
fix: enable FactualCorrectness language adaptation by @anistark in #2555
fix: resolve DiskCacheBackend pickling issue with InstructorLLM by @anistark in #2556
fix: lazy init DEFAULT_TOKENIZER to avoid network calls at import time by @cgaswin in #2545
fix: comment on failed task by @anistark in #2557
docs: fix DiscreteMetric llm examples to match API by @cgaswin in #2558
fix: add repository parameter to checkout action for fork PR support by @anistark in #2559
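
The lazy-initialization fix in #2545 follows a common Python pattern: defer building an expensive object until it is first used, so that importing the module does no work. The sketch below is a generic illustration of that pattern, not the ragas implementation. `get_default_tokenizer` and `_load_tokenizer` are hypothetical names, and the loader is a trivial stand-in for whatever would fetch tokenizer data.

```python
from functools import lru_cache

# Counts loader invocations so the laziness is observable.
load_calls = 0


def _load_tokenizer():
    """Hypothetical stand-in for an expensive loader (e.g. one that hits the network)."""
    global load_calls
    load_calls += 1
    return str.split  # a trivial whitespace "tokenizer", for illustration only


@lru_cache(maxsize=1)
def get_default_tokenizer():
    """Build the tokenizer on the first call only; later calls reuse the cached object."""
    return _load_tokenizer()


# Defining the functions above runs no loader; the first call below does.
tokens = get_default_tokenizer()("lazy init avoids import-time work")
```

Calling `get_default_tokenizer()` again returns the cached object without invoking the loader a second time.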

Full Changelog: v0.4.2...v0.4.3

Contributors

anistark, sanjeed5, and cgaswin


23 Dec 17:13

anistark

v0.4.2

bf61178

v0.4.2

What's Changed

feat: migrate SQLSemanticEquivalence to collections API by @anistark in #2496
feat: migrate DataCompyScore to collections API by @anistark in #2499
fix: migrate CHRF Score to new metrics collections by @anistark in #2500
Feat/improve rag quickstart by @anistark in #2501
fix: handle classification array length mismatch in TopicAdherence metric by @anistark in #2503
feat: migrate quoted spans metric to collections api by @anistark in #2504
fix: handle instructor modes for json and tools by @anistark in #2505
docs: remove obsolete 'write your own metric' guides by @sanjeed5 in #2488
fix: BasePrompt.adapt() structured output + language adaptation guide by @sanjeed5 in #2487
Add AG-UI Protocol Integration for Agent Evaluation by @contextablemark in #2395
docs: fix modifying-prompts-metrics guide with correct API by @sanjeed5 in #2486
fix: handle nested dicts/lists in ToolCallF1 args to prevent unhashable type error by @anistark in #2507
feat: add Claude docs auto-update workflow by @sanjeed5 in #2508
fix: allow unannotated parameters to accept any type by @dhyaneesh in #2513
feat: migrate MultiModalFaithfulness to collections API by @anistark in #2515
feat: migrate MultiModalRelevance to collections API by @anistark in #2518
feat: add support for new google-genai SDK with backwards compatibility for deprecated google-generativeai by @anistark in #2517
fix: use instructor Mode.JSON for litellm and generic providers to fix Dict type validation errors by @anistark in #2514
[AG-UI] Replacing "OpenAI" with "AsyncOpenAI". by @contextablemark in #2520
Feat: document PDF export workflow by @cgaswin in #2522
refactor: deprecate legacy metrics by @anistark in #2519
fix: claude workflows to use github token by @anistark in #2527
feat: add HuggingFace tokenizer support in knowledge graph operations by @anistark in #2524
feat: add generate_with_chunks for pre-chunked documents by @MinseongS in #2526
feat(docs): add offline mermaid support to PDF export by @cgaswin in #2530
fix: claude workflows to use pat token instead of github token to support forked PRs by @anistark in #2531
docs: change query execution to use asyncio.run by @yangzq50 in #2532
fix: increase max-turns and update prompt by @anistark in #2534
chore: remove survey link by @anistark in #2535
feat: add caching support for metrics collections by @anistark in #2533
feat: add caching support for embeddings by @anistark in #2536

New Contributors

@contextablemark made their first contribution in #2395
@cgaswin made their first contribution in #2522
@MinseongS made their first contribution in #2526
@yangzq50 made their first contribution in #2532

Full Changelog: v0.4.1...v0.4.2

Contributors

anistark, sanjeed5, and 5 other contributors


10 Dec 16:28

anistark

v0.4.1

4308b24

v0.4.1

What's Changed

feat: add save/load methods to BasePrompt by @anistark in #2465
docs: update run_config guide to use collections API by @sanjeed5 in #2468
fix: add anthropic and gemini clients for custom clients by @anistark in #2472
feat: migrate ToolCallAccuracy to collections API by @anistark in #2476
chore: add survey link to readme and docs banner by @anistark in #2478
Fix: apply_transforms(kg, transforms, run_config=run_config or RunCon… by @narabi in #2480
feat: migrate ToolCallF1 to collections API with set-based F1 scoring by @anistark in #2483
New input parameter for the TestsetGenerator class: LLM_CONTEXT by @narabi in #2474
feat: migrate TopicAdherence and AgentGoalAccuracy to collections API by @anistark in #2485
docs: organize integrations nav into collapsible groups by @sanjeed5 in #2492
aembed_text() replaces embed_text() in AnswerRelevancy by @anexplore in #2495
docs: remove broken reference to train_your_own_metric.md by @sanjeed5 in #2489
feat: migrate Rubrics Metrics to collections API by @anistark in #2494
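
For readers unfamiliar with the "set-based F1 scoring" mentioned in #2483, the sketch below shows the general idea: treat predicted and reference items as sets, then combine precision and recall. It is a generic illustration, not the ragas `ToolCallF1` code. The function name `set_f1`, the tuple encoding of tool calls, and the choice to score two empty sets as 1.0 are all assumptions made for the example.

```python
def set_f1(predicted, reference):
    """F1 between two collections, treated as sets (order and duplicates are ignored)."""
    pred, ref = set(predicted), set(reference)
    if not pred and not ref:
        return 1.0  # assumption: two empty sets count as a perfect match
    true_positives = len(pred & ref)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(ref)
    return 2 * precision * recall / (precision + recall)


# Tool calls encoded as hashable (name, args-as-tuples) pairs: a hypothetical encoding.
predicted = [("search", (("q", "ragas"),)), ("open", (("id", 1),))]
reference = [("search", (("q", "ragas"),)), ("close", (("id", 1),))]
score = set_f1(predicted, reference)  # one of two calls matches on each side
```

Because items must be hashable to go into a set, nested dict or list arguments need to be converted to tuples (or similar) first, as in the encoding above.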

New Contributors

@narabi made their first contribution in #2480
@anexplore made their first contribution in #2495

Full Changelog: v0.4.0...v0.4.1

Contributors

anexplore, anistark, and 2 other contributors


03 Dec 16:22

anistark

v0.4.0

877ed6c

v0.4.0

What's Changed

docs: complete collections API documentation for remaining metrics by @sanjeed5 in #2420
feat: support GPT-5 and o-series models with automatic temperature and top_p constraint handling by @anistark in #2418
update: add llm options as tabs to quickstart by @anistark in #2421
feat: migrate to instructor.from_provider for universal provider support by @anistark in #2424
docs: fix typos in some files by @Edge-Seven in #2429
feat: implement prompt class for context precision by @anistark in #2433
fix: docs for quickstart by @anistark in #2434
Fix 2432: Update import statements for langchain modules by @rvernica in #2436
Fix 2440 Adjust LLM parameters in evals.py by @rvernica in #2441
feat: migrate context recall, answer relevancy, and context entity recall metrics to modular BasePrompt architecture by @anistark in #2435
feat: migrate 6 metrics (ContextRelevance, Response Groundedness, AnswerAccuracy, Faithfulness, AnswerCorrectness, SummaryScore) to BasePrompt by @anistark in #2443
feat: migrate final metrics (FactualCorrectness, NoiseSensitivity) to modular BasePrompt architecture and update docs by @anistark in #2444
chore: add COC by @anistark in #2437
docs: clarify MLflow is required, not optional in RAG evaluation guide by @anistark in #2447
chore: cleanup old patterns and update links by @anistark in #2449
chore: rebranding efforts by @jjmachan in #2445
feat: dual adapter support (Instructor + LiteLLM) by @anistark in #2446
fix: resolve InstructorLLM detection bug and add EvaluationDataset backend support for experiments by @anistark in #2451
fix: retrieved_contexts string filtering in LangChain integration by @dhyaneesh in #2452
fix: correct MultiTurnSample user_input validation logic by @harshil-sanghvi in #2426
fix: automatic embedding provider matching for LLMs by @anistark in #2454
fix: make GitPython an optional dependency by @anistark in #2453
docs: Update customizations how-to guides to use collections API and LLM factory by @sanjeed5 in #2425
fix: detect async clients in closures for instructor-wrapped litellm routers by @anistark in #2458
fix: quickstart by @anistark in #2463
chore: update calendar email by @anistark in #2462
fix: make GoogleEmbeddings handle GenerativeModel clients by auto-extracting genai module by @anistark in #2466
docs: add migration guide for v0.4 by @anistark in #2461

New Contributors

@Edge-Seven made their first contribution in #2429
@rvernica made their first contribution in #2436
@dhyaneesh made their first contribution in #2452

Full Changelog: v0.3.9...v0.4.0

Contributors

rvernica, jjmachan, and 5 other contributors


11 Nov 17:24

anistark

v0.3.9

8b4653c

v0.3.9

What's Changed

fix(docs): add missing line break so the step title and description a… by @nkch1k in #2391
Migrate SummaryScore by @rhlbhatnagar in #2376
feat: add metadata fields for synthetic data traceability by @dev-jonathan in #2389
Migrate noise sensitivity by @rhlbhatnagar in #2379
docs: quickstart guide with interactive LLM and project structure by @anistark in #2380
Migrate Faithfulness by @rhlbhatnagar in #2384
fix: docs for discrete, numeric and ranking using instructor by @anistark in #2397
Migrate Answer Accuracy + Context Relevance by @rhlbhatnagar in #2390
refactor: remove aspect critic and simple criteria metrics with discrete metric examples by @anistark in #2399
Migrate Context Precision with + without ref by @rhlbhatnagar in #2398
docs: fix recall formula label in SQL metrics by @tysoncung in #2405
chore: remove deprecated ground_truths by @anistark in #2402
docs: Add documentation for metrics.collections API by @sanjeed5 in #2407
Migrate factual correctness by @rhlbhatnagar in #2401
refactor: remove redundant AnswerSimilarity from collections API in favor of SemanticSimilarity by @anistark in #2410
docs: Update documentation structure to reflect experiments-first paradigm by @sanjeed5 in #2394
Response Groundedness by @rhlbhatnagar in #2403
fix: office hours link update by @anistark in #2415
Refactor/removing deprecated by @anistark in #2412
fix: handle max_completion_tokens error for newer openai models by @anistark in #2413
refactor: make embeddings optional in AnswerCorrectness when using pure factuality mode by @anistark in #2414
Feat/migrate context recall by @jjmachan in #2372
chore: update quickstart llm config by @anistark in #2417

New Contributors

@nkch1k made their first contribution in #2391
@dev-jonathan made their first contribution in #2389
@tysoncung made their first contribution in #2405

Full Changelog: v0.3.8...v0.3.9

Contributors

jjmachan, anistark, and 5 other contributors


28 Oct 19:09

anistark

v0.3.8

58f20a6

v0.3.8

What's Changed

feat: semantic similarity migrated to collections by @anistark in #2361
feat: Add reusable testing infrastructure for metrics migration by @jjmachan in #2370
add: console scripts for ragas_examples by @anistark in #2367
feat: add quickstart cmd with templates to run by @anistark in #2374
fix: detect uvloop and skip nest_asyncio to prevent patching errors by @anistark in #2369
Remove import not used by @ChenyangLi4288 in #2364
Migrate answer_correctness by @rhlbhatnagar in #2365
Migrate context_entity_recall by @rhlbhatnagar in #2366
feat: aspect critic metric for coherence, harmfulness, maliciousness, correctness by @anistark in #2375
Fixed: NameError during evaluation of llamaindex query engine by @Prigoistic in #2331
Remove error suppressor in async_utils.py and engine.py by @ChenyangLi4288 in #2362
docs: clarify Context Relevance implementation differs from paper design by @anistark in #2378
fix: add missing metrics (ToolCallF1, ChrfScore) to sidebar and document deprecated ContextUtilization by @anistark in #2381
refactor: instructor_llm_factory merge with llm_factory by @anistark in #2382
fix: handle tuple-formatted entities in SingleHopSpecificQuerySynthesizer by @anistark in #2377
feat: simple criteria migrated to collections by @anistark in #2386
chore: remove deprecation warnings for LangchainLLMWrapper, LlamaIndexLLMWrapper, and embedding wrappers by @anistark in #2387

New Contributors

@ChenyangLi4288 made their first contribution in #2364

Full Changelog: v0.3.7...v0.3.8

Contributors

jjmachan, anistark, and 3 other contributors


14 Oct 16:21

jjmachan

v0.3.7

f31c365

v0.3.7

What's Changed

refactor: improve metrics code quality by @anistark in #2337
chore: remove old analytics by @jjmachan in #2338
Fix/query distribution robustness by @yatoyun in #2340
Simplify earlier how to guides in docs by @sanjeed5 in #2319
docs: reorganize prompt evaluation guides in navigation by @sanjeed5 in #2346
Metrics migration, migrate rouge + answer relevance by @rhlbhatnagar in #2335
fix: streamline theme extraction from overlaps in MultiHopSpecificQue… by @kenzoyan in #2347
Test/metric new compare by @anistark in #2349
feat: bleu score migrated to collections by @anistark in #2352
fix: Add List[List[str]] formats for overlapped items in theme extraction (Continuation in #2347) by @kenzoyan in #2355
feat: string metrics migrated to collections by @anistark in #2356
feat: answer similarity migrated to collections by @anistark in #2358
fix: add missing props token_usage_parser for test generation methods #2359 by @bhkj9999 in #2360
feat: add bypass_n option to LangchainLLMWrapper for n-completion control by @SimFG in #2354
docs: Add how-to guide for aligning LLM-as-Judge by @sanjeed5 in #2348

New Contributors

@yatoyun made their first contribution in #2340
@kenzoyan made their first contribution in #2347
@bhkj9999 made their first contribution in #2360
@SimFG made their first contribution in #2354

Full Changelog: v0.3.6...v0.3.7

Contributors

jjmachan, anistark, and 6 other contributors


03 Oct 03:56

jjmachan

v0.3.6

49f47f1

v0.3.6

What's Changed

Feature/chrf score by @kauabh in #2221
Fix/asyncio by @anistark in #2294
Fix: update simple RAG init to use embed_text(s) (docs) by @s3pi in #2292
Update _bleu_score.py by @kauabh in #2297
Refactor/update gemini to genai sdk by @sahusiddharth in #2240
Feature/metrics input flexibility by @anistark in #2298
Ensure old_temperature is set correctly. Fixes #1937 and #2110 by @claudepi in #2295
Enhance EmbeddingExtractor to support both async and sync methods for… by @telesoho in #2286
Tokens counting by @anistark in #2299
Fix/tool call accuracy by @anistark in #2300
fix: coroutine warning for bleu by @anistark in #2301
Add base_url parameter to embedding_factor by @anistark in #2303
fix: add disallowed_special on tiktoken encode by @anistark in #2304
Feat/tool call f1 1893 by @anistark in #2305
Feature/azure token usage extraction by @anistark in #2306
fix: improve metric decorators with better validation and error handling by @jjmachan in #2302
Metric/parallel tool call by @anistark in #2307
Fix: avoid ambiguous truth value for empty numpy array in HuggingfaceEmbeddings (fixes #2080) by @Rahul2512Chauhan in #2308
Devpod cn/main by @anistark in #2309
Feat/quoted spans metric by @anistark in #2311
Fix noise sensitivity compute by @anistark in #2312
Corrected numerous typos in Markdown files. by @ker2xu in #1994
Deprecation warnings for LLMs and Prompts by @rhlbhatnagar in #2253
Docs/eval_rag_agent - how to evaluate and improve rag app by @sanjeed5 in #2293
Add llamaindex agentic evals gemini by @anistark in #2317
fix: type str in tests by @anistark in #2318
Fix generate_multiple caching issue (#1980) by @Rahul2512Chauhan in #2314
fix: metric inheritance patterns: separate factory-created metrics from class-instantiated metrics by @jjmachan in #2316
fix: concurrent ResponseRelevancy by @anistark in #2328
fix: answer_relevancy scoring logic to prevent false zero by @anistark in #2327
feat: Add OCI Gen AI Integration for Direct LLM Support by @harshil-sanghvi in #2321
feat: Add save/load functionality and improved repr for LLM-based metrics by @jjmachan in #2320
Fix: Fixed the Numpy 3.13 issue by @Prigoistic in #2282
refactor: docs and warnings for metric base new structure by @anistark in #2333
fix: typing by @anistark in #2334

New Contributors

@kauabh made their first contribution in #2221
@s3pi made their first contribution in #2292
@claudepi made their first contribution in #2295
@telesoho made their first contribution in #2286
@ker2xu made their first contribution in #1994
@harshil-sanghvi made their first contribution in #2321
@Prigoistic made their first contribution in #2282

Full Changelog: v0.3.5...v0.3.6

Contributors

jjmachan, anistark, and 11 other contributors


17 Sep 19:13

jjmachan

v0.3.5

6bde58a

v0.3.5

What's Changed

Docs/howto-texttosqlagent by @sanjeed5 in #2264
fix: preview logo was too small. by @anistark in #2277
modified the documentation to be in sync with current output format by @kotalaraghava in #2281
removed some meta properties to test by @jjmachan in #2278
feature: improve async / executor functionality by @ahgraber in #2070
modification of the translate instruction by @anistark in #2284
Remove experimental from docs and fix examples in docs by @sanjeed5 in #2270
fix: resolve TypeError in TopicAdherenceScore bitwise operations by @anistark in #2258
Knowledge graph/optimize for large corpus by @anistark in #2267
Update _nv_metrics.py by @titericz in #2053
Add telemetry by @rhlbhatnagar in #2260
OpenAI model cost by @anistark in #2287
docs: agent metrics code examples improvement by @yesidc in #1983
Prompt Optimization Tutorial by @sahusiddharth in #1993
Feature/metric type checking by @anistark in #2288
improved the release script for ragas-examples by @jjmachan in #2289
fix: removed the need for regex patterns by @jjmachan in #2290

New Contributors

@kotalaraghava made their first contribution in #2281
@yesidc made their first contribution in #1983

Full Changelog: v0.3.4...v0.3.5

Contributors

jjmachan, anistark, and 7 other contributors


17 Sep 17:40

jjmachan

v0.3.5rc2

6bde58a

v0.3.5rc2 Pre-release

fix: removed the need for regex patterns (#2290)
