feat: upgrade evaluation pipeline with W&B and update docs by Western-1 · Pull Request #2 · Western-1/rag-doc-chat

Western-1 · 2026-01-01T16:42:08Z

📝 CHANGELOG

All notable changes to the Talk to Your Docs RAG System.

[3.1.0] - 2026-01-01 - W&B Evaluation Pipeline

✨ New Features

Weights & Biases Integration: Added full support for experiment tracking.
- New script evaluation/track_experiment.py.
- Logs Ragas metrics (Faithfulness, Precision, Recall) to W&B cloud.
- Logs detailed pandas DataFrames with Q&A pairs for analysis.
Auto-Ingestion (Cold Start): The evaluation pipeline now detects when Qdrant is empty.
- Automatically generates a synthetic PDF (test_data_autogen.pdf) using ReportLab.
- Ingests the generated data, then cleans up automatically (zero-setup testing).
Robust Evaluator Class: Introduced RAGWandbEvaluator class for cleaner, modular evaluation logic.
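
The cold-start flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in evaluation/track_experiment.py: the function name ensure_ingested and its three callables (count_points, write_document, ingest) are hypothetical stand-ins for the Qdrant point count, the ReportLab PDF generation, and the ingestion step.

```python
import os
import tempfile
from typing import Callable


def ensure_ingested(
    count_points: Callable[[], int],
    write_document: Callable[[str], None],
    ingest: Callable[[str], None],
    filename: str = "test_data_autogen.pdf",
) -> bool:
    """Ingest a synthetic document if the store is empty.

    All three callables are hypothetical hooks: in a real pipeline they
    would wrap the vector store client, the PDF generator, and the
    ingestion routine. Returns True if ingestion ran, False otherwise.
    """
    if count_points() > 0:
        # Store already has data, so nothing needs to be generated.
        return False

    # The temporary directory and the generated file are removed when
    # the block exits, even if ingestion raises.
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, filename)
        write_document(path)
        ingest(path)
    return True
```

Passing the steps in as callables keeps the sketch runnable without Qdrant or ReportLab installed; the actual script presumably calls those libraries directly.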

🔧 Improvements / Performance

Groq Rate Limit Handling: Monkey-patched ChatGroq to throttle outgoing requests.
- Intercepts invoke calls to enforce a 10s delay between requests.
- Prevents 429 Too Many Requests errors on Free Tier (8k TPM limit).
- Uses object.__setattr__ to bypass Pydantic validation on LangChain objects.
Clean CLI Output: Silenced noisy loggers (httpx, groq, httpcore, qdrant_client) during evaluation.
Increased Resilience: Updated RunConfig with max_retries=10 and timeout=600 (seconds) for long-running evaluations.
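
As an illustration of the rate-limit patch described above, here is a minimal sketch of wrapping invoke with a minimum delay and installing the wrapper via object.__setattr__. StrictModel is a hypothetical stand-in that rejects unknown attributes the way a Pydantic-based client does; throttle_invoke and its clock and sleep parameters are also assumptions made for this example, not names from the repository.

```python
import time


class StrictModel:
    """Hypothetical stand-in for a validated client such as ChatGroq."""

    _fields = {"model"}

    def __init__(self, model):
        object.__setattr__(self, "model", model)

    def __setattr__(self, name, value):
        # Mimics validation that refuses attributes that are not fields.
        if name not in self._fields:
            raise ValueError(
                f'"{type(self).__name__}" object has no field "{name}"'
            )
        object.__setattr__(self, name, value)

    def invoke(self, prompt):
        return f"echo: {prompt}"


def throttle_invoke(client, min_interval, clock=time.monotonic, sleep=time.sleep):
    """Replace client.invoke with a version that spaces calls apart."""
    original = client.invoke  # keep the bound method before patching
    last_call = [None]        # mutable cell shared with the closure

    def throttled(*args, **kwargs):
        if last_call[0] is not None:
            wait = min_interval - (clock() - last_call[0])
            if wait > 0:
                sleep(wait)
        last_call[0] = clock()
        return original(*args, **kwargs)

    # Plain assignment would hit the class's __setattr__ and fail, so
    # the instance attribute is written through object.__setattr__.
    object.__setattr__(client, "invoke", throttled)
    return client
```

With min_interval=10 this matches the 10s spacing mentioned in the changelog. Injecting clock and sleep is only there so the behaviour can be checked without actually waiting.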

🐛 Bug Fixes

Pydantic Validation Bypass: Fixed ValueError: "ChatGroq" object has no field "invoke" by setting the attribute directly with object.__setattr__.
LangChain Prompt Handling: Fixed Chain invocation failed error in src/rag.py.
- Added check: if isinstance(lc_prompt, str) to convert string prompts from Langfuse into ChatPromptTemplate.
Git Hygiene: Updated .gitignore to strictly exclude wandb/ local directories and artifacts.
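
The prompt-handling fix above amounts to normalizing the prompt type before building the chain. A rough sketch of that check, assuming a hypothetical helper ensure_template and using FakeTemplate as a stand-in for ChatPromptTemplate so the example runs without LangChain installed:

```python
from dataclasses import dataclass


@dataclass
class FakeTemplate:
    """Hypothetical stand-in for a prompt template class."""

    template: str

    def format(self, **kwargs) -> str:
        return self.template.format(**kwargs)


def ensure_template(lc_prompt, from_template=FakeTemplate):
    """Return a template object whether given a string or a template.

    from_template is an assumed hook: in src/rag.py the conversion
    targets ChatPromptTemplate, per the changelog entry.
    """
    if isinstance(lc_prompt, str):
        # A raw string (for example, one fetched from Langfuse) is
        # wrapped so downstream code can treat both cases the same.
        return from_template(lc_prompt)
    return lc_prompt
```

Objects that are already templates pass through unchanged, so the helper is safe to call on either kind of input.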

📚 Documentation

README.md:
- Added Evaluation & Tracking section.
- Added comparison tables for "Baseline" vs "Tracked" results.
- Added W&B integration screenshot.
- Added make track command documentation.
New Assets: Added docs/rag-eval-metrics-wandb.png.

📂 Files Changed

Added:
- evaluation/track_experiment.py
- images/rag-eval-metrics-wandb.png
Modified:
- src/rag.py (Fixed prompt template type error)
- .gitignore (Added wandb rules)
- README.md (Added evaluation docs)

Upgrade Steps for Evaluation

Install new dependencies:
```
pip install reportlab wandb
```
Run tracked experiment:
```
make track
```

feat: upgrade evaluation pipeline with W&B and update docs

5134696

Western-1 merged commit 3b320e1 into main Jan 1, 2026
1 check passed

Western-1 deleted the feat/wandb-integration branch January 1, 2026 16:47

Western-1 merged 1 commit into main from feat/wandb-integration
