Releases: vibrantlabsai/ragas
v0.4.3
What's Changed
- feat: add DSPyOptimizer with MIPROv2 for advanced prompt optimization by @anistark in #2537
- feat(docs): add llms.txt generation for LLM-friendly documentation by @sanjeed5 in #2539
- feat: dspy caching by @anistark in #2542
- feat: add system prompt support for InstructorLLM and LiteLLMStructuredLLM by @anistark in #2543
- feat(docs): add copy-to-llm button for easy AI tool integration by @sanjeed5 in #2541
- fix: use PAT token for docs-check CI as docs-apply CI by @anistark in #2546
- feat: add remaining quickstart templates by @anistark in #2547
- fix: enable FactualCorrectness language adaptation by @anistark in #2555
- fix: resolve DiskCacheBackend pickling issue with InstructorLLM by @anistark in #2556
- fix: lazy init DEFAULT_TOKENIZER to avoid network calls at import time. by @cgaswin in #2545
- fix: comment on failed task by @anistark in #2557
- docs: fix DiscreteMetric llm examples to match API by @cgaswin in #2558
- fix: add repository parameter to checkout action for fork PR support by @anistark in #2559
Full Changelog: v0.4.2...v0.4.3
v0.4.2
What's Changed
- feat: migrate SQLSemanticEquivalence to collections API by @anistark in #2496
- feat: migrate DataCompyScore to collections API by @anistark in #2499
- fix: migrate CHRF Score to new metrics collections by @anistark in #2500
- Feat/improve rag quickstart by @anistark in #2501
- fix: handle classification array length mismatch in TopicAdherence metric by @anistark in #2503
- feat: migrate quoted spans metric to collections api by @anistark in #2504
- fix: handle instructor modes for json and tools by @anistark in #2505
- docs: remove obsolete 'write your own metric' guides by @sanjeed5 in #2488
- fix: BasePrompt.adapt() structured output + language adaptation guide by @sanjeed5 in #2487
- Add AG-UI Protocol Integration for Agent Evaluation by @contextablemark in #2395
- docs: fix modifying-prompts-metrics guide with correct API by @sanjeed5 in #2486
- fix: handle nested dicts/lists in ToolCallF1 args to prevent unhashable type error by @anistark in #2507
- feat: add Claude docs auto-update workflow by @sanjeed5 in #2508
- fix: allow unannotated parameters to accept any type by @dhyaneesh in #2513
- feat: migrate MultiModalFaithfulness to collections API by @anistark in #2515
- feat: migrate MultiModalRelevance to collections API by @anistark in #2518
- feat: add support for new google-genai SDK with backwards compatibility for deprecated google-generativeai by @anistark in #2517
- fix: use instructor Mode.JSON for litellm and generic providers to fix Dict type validation errors by @anistark in #2514
- [AG-UI] Replacing "OpenAI" with "AsyncOpenAI". by @contextablemark in #2520
- Feat: document PDF export workflow by @cgaswin in #2522
- refactor: deprecate legacy metrics by @anistark in #2519
- fix: claude workflows to use github token by @anistark in #2527
- feat: add HuggingFace tokenizer support in knowledge graph operations by @anistark in #2524
- feat: add generate_with_chunks for pre-chunked documents by @MinseongS in #2526
- feat(docs): add offline mermaid support to PDF export by @cgaswin in #2530
- fix: claude workflows to use pat token instead of github token to support forked PRs by @anistark in #2531
- docs: change query execution to use asyncio.run by @yangzq50 in #2532
- fix: increase max-turns and update prompt by @anistark in #2534
- chore: remove survey link by @anistark in #2535
- feat: add caching support for metrics collections by @anistark in #2533
- feat: add caching support for embeddings by @anistark in #2536
New Contributors
- @contextablemark made their first contribution in #2395
- @cgaswin made their first contribution in #2522
- @MinseongS made their first contribution in #2526
- @yangzq50 made their first contribution in #2532
Full Changelog: v0.4.1...v0.4.2
v0.4.1
What's Changed
- feat: add save/load methods to BasePrompt by @anistark in #2465
- docs: update run_config guide to use collections API by @sanjeed5 in #2468
- fix: add anthropic and gemini clients for custom clients by @anistark in #2472
- feat: migrate ToolCallAccuracy to collections API by @anistark in #2476
- chore: add survey link to readme and docs banner by @anistark in #2478
- Fix: apply_transforms(kg, transforms, run_config=run_config or RunCon… by @narabi in #2480
- feat: migrate ToolCallF1 to collections API with set-based F1 scoring by @anistark in #2483
- New input parameter for the TestsetGenerator Class : LLM_CONTEXT by @narabi in #2474
- feat: migrate TopicAdherence and AgentGoalAccuracy to collections API by @anistark in #2485
- docs: organize integrations nav into collapsible groups by @sanjeed5 in #2492
- aembed_text() replace embed_text() in AnswerRelevancy by @anexplore in #2495
- docs: remove broken reference to train_your_own_metric.md by @sanjeed5 in #2489
- feat: migrate Rubrics Metrics to collections API by @anistark in #2494
New Contributors
- @narabi made their first contribution in #2480
- @anexplore made their first contribution in #2495
Full Changelog: v0.4.0...v0.4.1
v0.4.0
What's Changed
- docs: complete collections API documentation for remaining metrics by @sanjeed5 in #2420
- feat: support GPT-5 and o-series models with automatic temperature and top_p constraint handling by @anistark in #2418
- update: add llm options as tabs to quickstart by @anistark in #2421
- feat: migrate to instructor.from_provider for universal provider support by @anistark in #2424
- docs: fix typos in some files by @Edge-Seven in #2429
- feat: implement prompt class for context precision by @anistark in #2433
- fix: docs for quickstart by @anistark in #2434
- Fix 2432: Update import statements for langchain modules by @rvernica in #2436
- Fix 2440 Adjust LLM parameters in evals.py by @rvernica in #2441
- feat: migrate context recall, answer relevancy, and context entity recall metrics to modular BasePrompt architecture by @anistark in #2435
- feat: migrate 6 metrics (ContextRelevance, Response Groundedness, AnswerAccuracy, Faithfulness, AnswerCorrectness, SummaryScore) to BasePrompt by @anistark in #2443
- feat: migrate final metrics (FactualCorrectness, NoiseSensitivity) to modular BasePrompt architecture and update docs by @anistark in #2444
- chore: add COC by @anistark in #2437
- docs: clarify MLflow is required, not optional in RAG evaluation guide by @anistark in #2447
- chore: cleanup old patterns and update links by @anistark in #2449
- chore: rebranding efforts by @jjmachan in #2445
- feat: dual adapter support (Instructor + LiteLLM) by @anistark in #2446
- fix: resolve InstructorLLM detection bug and add EvaluationDataset backend support for experiments by @anistark in #2451
- fix: retrieved_contexts string filtering in LangChain integration by @dhyaneesh in #2452
- fix: correct MultiTurnSample user_input validation logic by @harshil-sanghvi in #2426
- fix: automatic embedding provider matching for LLMs by @anistark in #2454
- fix: make GitPython an optional dependency by @anistark in #2453
- docs: Update customizations how-to guides to use collections API and LLM factory by @sanjeed5 in #2425
- fix: detect async clients in closures for instructor-wrapped litellm routers by @anistark in #2458
- fix: quickstart by @anistark in #2463
- chore: update calendar email by @anistark in #2462
- fix: make GoogleEmbeddings handle GenerativeModel clients by auto-extracting genai module by @anistark in #2466
- docs: add migration guide for v0.4 by @anistark in #2461
New Contributors
- @Edge-Seven made their first contribution in #2429
- @rvernica made their first contribution in #2436
- @dhyaneesh made their first contribution in #2452
Full Changelog: v0.3.9...v0.4.0
v0.3.9
What's Changed
- fix(docs): add missing line break so the step title and description a… by @nkch1k in #2391
- Migrate SummaryScore by @rhlbhatnagar in #2376
- feat: add metadata fields for synthetic data traceability by @dev-jonathan in #2389
- Migrate noise sensitivity by @rhlbhatnagar in #2379
- docs: quickstart guide with interactive LLM and project structure by @anistark in #2380
- Migrate Faithfulness by @rhlbhatnagar in #2384
- fix: docs for discrete, numeric and ranking using instructor by @anistark in #2397
- Migrate Answer Accuracy + Context Relevance by @rhlbhatnagar in #2390
- refactor: remove aspect critic and simple criteria metrics with discrete metric examples by @anistark in #2399
- Migrate Context Precision with + without ref by @rhlbhatnagar in #2398
- docs: fix recall formula label in SQL metrics by @tysoncung in #2405
- chore: remove deprecated ground_truths by @anistark in #2402
- docs: Add documentation for metrics.collections API by @sanjeed5 in #2407
- Migrate factual correctness by @rhlbhatnagar in #2401
- refactor: remove redundant AnswerSimilarity from collections API in favor of SemanticSimilarity by @anistark in #2410
- docs: Update documentation structure to reflect experiments-first paradigm by @sanjeed5 in #2394
- Response Groundedness by @rhlbhatnagar in #2403
- fix: office hours link update by @anistark in #2415
- Refactor/removing deprecated by @anistark in #2412
- fix: handle max_completeion_tokens error for newer openai models by @anistark in #2413
- refactor: make embeddings optional in AnswerCorrectness when using pure factuality mode by @anistark in #2414
- Feat/migrate context recall by @jjmachan in #2372
- chore: update quickstart llm config by @anistark in #2417
New Contributors
- @nkch1k made their first contribution in #2391
- @dev-jonathan made their first contribution in #2389
- @tysoncung made their first contribution in #2405
Full Changelog: v0.3.8...v0.3.9
v0.3.8
What's Changed
- feat: semantic similarity migrated to collections by @anistark in #2361
- feat: Add reusable testing infrastructure for metrics migration by @jjmachan in #2370
- add: console scripts for ragas_examples by @anistark in #2367
- feat: add quickstart cmd with templates to run by @anistark in #2374
- fix: detect uvloop and skip nest_asyncio to prevent patching errors by @anistark in #2369
- Remove import not used by @ChenyangLi4288 in #2364
- Migrate answer_correctness by @rhlbhatnagar in #2365
- Migrate context_entity_recall by @rhlbhatnagar in #2366
- feat: aspect critic metric for coherence, harmfulness, maliciousness, correctness by @anistark in #2375
- Fixed: NameError during evaluation of llamaindex query engine by @Prigoistic in #2331
- Remove error suppressor in async_utils.py and engine.py by @ChenyangLi4288 in #2362
- docs: clarify Context Relevance implementation differs from paper design by @anistark in #2378
- fix: add missing metrics (ToolCallF1, ChrfScore) to sidebar and document deprecated ContextUtilization by @anistark in #2381
- refactor: instructor_llm_factory merge with llm_factory by @anistark in #2382
- fix: handle tuple-formatted entities in SingleHopSpecificQuerySynthesizer by @anistark in #2377
- feat: simple criteria migrated to collections by @anistark in #2386
- chore: remove deprecation warnings for LangchainLLMWrapper, LlamaIndexLLMWrapper, and embedding wrappers by @anistark in #2387
New Contributors
- @ChenyangLi4288 made their first contribution in #2364
Full Changelog: v0.3.7...v0.3.8
v0.3.7
What's Changed
- refactor: improve metrics code quality by @anistark in #2337
- chore: remove old analytics by @jjmachan in #2338
- Fix/query distribution robustness by @yatoyun in #2340
- Simplify earlier how to guides in docs by @sanjeed5 in #2319
- docs: reorganize prompt evaluation guides in navigation by @sanjeed5 in #2346
- Metrics migration, migrate rouge + answer relevance by @rhlbhatnagar in #2335
- fix: streamline theme extraction from overlaps in MultiHopSpecificQue… by @kenzoyan in #2347
- Test/metric new compare by @anistark in #2349
- feat: bleu score migrated to collections by @anistark in #2352
- fix: Add List[List[str]] formats for overlapped items in theme extraction (Continuation in #2347) by @kenzoyan in #2355
- feat: string metrics migrated to collections by @anistark in #2356
- feat: answer similarity migrated to collections by @anistark in #2358
- fix: add missing props token_usage_parser for test generation methods #2359 by @bhkj9999 in #2360
- feat: add bypass_n option to LangchainLLMWrapper for n-completion control by @SimFG in #2354
- docs: Add how-to guide for aligning LLM-as-Judge by @sanjeed5 in #2348
New Contributors
- @yatoyun made their first contribution in #2340
- @kenzoyan made their first contribution in #2347
- @bhkj9999 made their first contribution in #2360
- @SimFG made their first contribution in #2354
Full Changelog: v0.3.6...v0.3.7
v0.3.6
What's Changed
- Feature/chrf score by @kauabh in #2221
- Fix/asyncio by @anistark in #2294
- Fix: update simple RAG init to use embed_text(s) (docs) by @s3pi in #2292
- Update _bleu_score.py by @kauabh in #2297
- Refactor/update gemini to genai sdk by @sahusiddharth in #2240
- Feature/metrics input flexibility by @anistark in #2298
- Ensure old_temperature is set correctly. Fixes #1937 and #2110 by @claudepi in #2295
- Enhance EmbeddingExtractor to support both async and sync methods for… by @telesoho in #2286
- Tokens counting by @anistark in #2299
- Fix/tool call accuracy by @anistark in #2300
- fix: coroutine warning for bleu by @anistark in #2301
- Add base_url parameter to embedding_factor by @anistark in #2303
- fix: add disallowed_special on tiktoken encode by @anistark in #2304
- Feat/tool call f1 1893 by @anistark in #2305
- Feature/azure token usage extraction by @anistark in #2306
- fix: improve metric decorators with better validation and error handling by @jjmachan in #2302
- Metric/parallel tool call by @anistark in #2307
- Fix: avoid ambiguous truth value for empty numpy array in HuggingfaceEmbeddings (fixes #2080) by @Rahul2512Chauhan in #2308
- Devpod cn/main by @anistark in #2309
- Feat/quoted spans metric by @anistark in #2311
- Fix noise sensitivity compute by @anistark in #2312
- Corrected numerous typos in Markdown files. by @ker2xu in #1994
- Deprecation warnings for LLMs and Prompts by @rhlbhatnagar in #2253
- Docs/eval_rag_agent - how to evaluate and improve rag app by @sanjeed5 in #2293
- Add llamaindex agentic evals gemini by @anistark in #2317
- fix: type str in tests by @anistark in #2318
- Fix generate_multiple caching issue (#1980) by @Rahul2512Chauhan in #2314
- fix: metric inheritance patterns: separate factory-created metrics from class-instantiated metrics by @jjmachan in #2316
- fix: concurrent ResponseRelevancy by @anistark in #2328
- fix: answer_relevancy scoring logic to prevent false zero by @anistark in #2327
- feat: Add OCI Gen AI Integration for Direct LLM Support by @harshil-sanghvi in #2321
- feat: Add save/load functionality and improved repr for LLM-based metrics by @jjmachan in #2320
- Fix: Fixed the Numpy 3.13 issue by @Prigoistic in #2282
- refactor: docs and warnings for metric base new structure by @anistark in #2333
- fix: typing by @anistark in #2334
New Contributors
- @kauabh made their first contribution in #2221
- @s3pi made their first contribution in #2292
- @claudepi made their first contribution in #2295
- @telesoho made their first contribution in #2286
- @ker2xu made their first contribution in #1994
- @harshil-sanghvi made their first contribution in #2321
- @Prigoistic made their first contribution in #2282
Full Changelog: v0.3.5...v0.3.6
v0.3.5
What's Changed
- Docs/howto-texttosqlagent by @sanjeed5 in #2264
- fix: preview logo was too small. by @anistark in #2277
- modified the documentation to be in sync with current output format by @kotalaraghava in #2281
- removed some meta properties to test by @jjmachan in #2278
- feature: improve async / executor functionality by @ahgraber in #2070
- modification of the translate instruction by @anistark in #2284
- Remove experimental from docs and fix examples in docs by @sanjeed5 in #2270
- fix: resolve TypeError in TopicAdherenceScore bitwise operations by @anistark in #2258
- Knowledge graph/optimize for large corpus by @anistark in #2267
- Update _nv_metrics.py by @titericz in #2053
- Add telemetry by @rhlbhatnagar in #2260
- OpenAI model cost by @anistark in #2287
- docs: agent metrics code examples improvement by @yesidc in #1983
- Prompt Optimization Tutorial by @sahusiddharth in #1993
- Feature/metric type checking by @anistark in #2288
- improved the release script for ragas-examples by @jjmachan in #2289
- fix: removed the need for regex patterns by @jjmachan in #2290
New Contributors
- @kotalaraghava made their first contribution in #2281
- @yesidc made their first contribution in #1983
Full Changelog: v0.3.4...v0.3.5