feat: saturn query pipeline for rag optimisation #23
Conversation
Could we use Langchain Go's embedding framework? It supports the Voyage embedding model.
The model that I am using is not supported yet.

todo: inject query enhancement prompt instead of hardcoding in application
tommynguyen-vungle left a comment:
LGTM
* feat: saturn query pipeline for rag optimisation
* feat: remove hardcoded limit
* feat: remove unused metadata
* feat: todo comments
* feat: todo comments
* feat: decouple query rewriting and rag search
* chore: remove unused comments
* fix: fix missing s3 config, make embedding model configurable
* feat: debug voyage api key
* feat: substitute rag embedding provider env var, remove debug log
* feat: add IRSA support to service account template
* feat: add logs
* feat: add observability for query enhancement
* fix: fix empty input and tool name for tool-execution span
* feat: added embedding span and fixed incorrect token usage
* feat: vector search span
* feat: make date filter field configurable
* feat: let llm handles the date window
* feat: inject query enhancement prompt
* feat: handle corrupted metadata
* fix: fix race condition in S3Provider.Initialize()
* perf(rag): optimize result sorting from O(n²) to O(n log n)
* fix: sort dates in descending order, better for LLM
* fix: fix test
* fix: fix golangci lint err
* fix: remove redundant metadata filtering
* refactor: dates filter are stored as int
* refactor: dates filter are stored as int
* fix: fix lint
* PE-7777: Claude Sonnet 4.5 integration
* Support thinking for Claude Sonnet 4.5
* include Thinking Output In Response
* Fixed thinking messaged deletion to get thread replies (tuannvm#143) Signed-off-by: rangamani54 <[email protected]>
* ci(cursor): Add Cursor automated code-review workflow Signed-off-by: Tommy Nguyen <[email protected]>
* feat: saturn query pipeline for rag optimisation (#23)
* update CLAUDE.md
---------
Signed-off-by: rangamani54 <[email protected]>
Signed-off-by: Tommy Nguyen <[email protected]>
Co-authored-by: Ranga Mani Kumar <[email protected]>
Co-authored-by: Tommy Nguyen <[email protected]>
Co-authored-by: yekkhan-liftoff <[email protected]>
Summary
Enhanced RAG pipeline with query enhancement and Voyage embeddings. Query enhancement happens at the Slack client layer for all queries, embeddings are generated via Voyage AI, and AWS S3 Vectors is used for similarity search.
Flow
Stage 1: Query Enhancement (Slack Layer)
User Query: "What were Q3 2025 revenues for VX in APAC?"
↓
Slack Client → Query Enhancer (Claude Sonnet)
↓
EnhanceQuery(query, today="2025-11-03")
↓
Analyze: temporal query detected
↓
Output: {
enhancedQuery: "Q3 2025 VX revenues APAC",
metadata: { generatedDate: "2025-09-30" }
}
↓
Date Range Expanded: 2025-09-23 to 2025-09-30 (7-day window)
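
A minimal Go sketch of the enhancement step above. The `QueryEnhancer` interface, the `EnhancedQuery` struct, and `expandDateRange` are illustrative names, not the actual types in this PR; the 7-day window mirrors the example dates shown in the flow.

```go
package enhancer

import (
	"context"
	"time"
)

// EnhancedQuery mirrors the Stage 1 output: a rewritten query plus the
// metadata the enhancer resolved (e.g. the report date behind "Q3 2025").
type EnhancedQuery struct {
	Query    string            // "Q3 2025 VX revenues APAC"
	Metadata map[string]string // {"generatedDate": "2025-09-30"}
}

// QueryEnhancer is an assumed interface around the Claude Sonnet call made
// by the Slack client, using the injected query enhancement prompt.
type QueryEnhancer interface {
	EnhanceQuery(ctx context.Context, query string, today time.Time) (EnhancedQuery, error)
}

// expandDateRange widens the generated date into the window shown above
// (2025-09-23 ... 2025-09-30), returned as ISO date strings.
func expandDateRange(generatedDate time.Time, windowDays int) []string {
	dates := make([]string, 0, windowDays+1)
	for d := generatedDate.AddDate(0, 0, -windowDays); !d.After(generatedDate); d = d.AddDate(0, 0, 1) {
		dates = append(dates, d.Format("2006-01-02"))
	}
	return dates
}
```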
Stage 2: LLM Tool Detection
Slack Client → Main LLM (GPT-4.1)
↓
CallLLM(enhancedQuery, systemPrompt)
↓
LLM detects: needs rag_search tool
↓
Output: ToolCall {
name: "rag_search",
args: { query: "Q3 2025 VX revenues APAC" }
}
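
A small sketch of the tool-call shape the bridge receives from this stage; `ToolCall`, `RAGSearchArgs`, and `decodeRAGSearch` are assumed names for illustration, not the PR's actual types.

```go
package bridge

import "encoding/json"

// ToolCall mirrors the Stage 2 output from the main LLM.
type ToolCall struct {
	Name string          `json:"name"` // "rag_search"
	Args json.RawMessage `json:"args"` // {"query": "Q3 2025 VX revenues APAC"}
}

// RAGSearchArgs is the argument payload the bridge forwards, together with
// the enhancement metadata, to the RAG client's rag_search tool.
type RAGSearchArgs struct {
	Query string `json:"query"`
}

// decodeRAGSearch returns the parsed args when the LLM asked for rag_search.
func decodeRAGSearch(tc ToolCall) (*RAGSearchArgs, bool) {
	if tc.Name != "rag_search" {
		return nil, false
	}
	var args RAGSearchArgs
	if err := json.Unmarshal(tc.Args, &args); err != nil {
		return nil, false
	}
	return &args, true
}
```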
Stage 3: RAG Search with Embeddings
Slack Client → LLM-MCP Bridge
↓
ProcessLLMResponse(extraArgs: {metadata})
↓
Bridge → RAG Client
↓
CallTool("rag_search", args + metadata)
↓
RAG extracts metadata from args
↓
Build SearchOptions {
limit: 30,
dateFilter: ["2025-09-23"..."2025-09-30"]
}
↓
RAG Client → Voyage AI Embedding Provider
↓
POST /v1/embeddings
model: voyage-context-3
dimensions: 1024
↓
Output: queryVector [1024 floats]
↓
RAG Client → AWS S3 Vectors
↓
QueryVectors {
vector: queryVector,
topK: 7,
filter: { report_generated_date: {$in: dates} }
}
↓
S3: Cosine similarity search + metadata filtering
↓
Output: 7 results (scored & sorted)
↓
RAG: sortResultsByDate() - newest first
↓
Format: "Found 7 contexts...
--- Context 1 ---
Source: revenue_report.pdf
Date: 2025-09-30
Content: ..."
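
The embedding and ranking steps above could be sketched as follows. The Voyage payload is simplified (`model` and `input` come from the flow; `output_dimension` is an assumed field name), the S3 Vectors call is hidden behind an assumed `VectorStore` interface rather than the real AWS SDK types, and the descending sort corresponds to the O(n log n) `sortResultsByDate()` noted in the flow and commit list.

```go
package rag

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"sort"
)

// SearchOptions mirrors the options block built from the enhancement metadata.
type SearchOptions struct {
	Limit      int      // 30
	DateFilter []string // ["2025-09-23", ..., "2025-09-30"]
}

// embedQuery calls Voyage AI's embeddings endpoint to turn the enhanced
// query into a 1024-dimension vector. Request/response shapes are simplified.
func embedQuery(ctx context.Context, apiKey, query string) ([]float32, error) {
	payload, err := json.Marshal(map[string]any{
		"model":            "voyage-context-3",
		"input":            []string{query},
		"output_dimension": 1024, // assumed field name
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost,
		"https://api.voyageai.com/v1/embeddings", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	var out struct {
		Data []struct {
			Embedding []float32 `json:"embedding"`
		} `json:"data"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	if len(out.Data) == 0 {
		return nil, fmt.Errorf("voyage returned no embeddings")
	}
	return out.Data[0].Embedding, nil
}

// Result is one scored hit from the vector store.
type Result struct {
	Source  string
	Date    string // ISO dates compare correctly as strings
	Score   float64
	Content string
}

// VectorStore abstracts the S3 Vectors query (cosine similarity with a
// report_generated_date filter); the interface itself is an assumption.
type VectorStore interface {
	QueryVectors(ctx context.Context, vector []float32, topK int, dateFilter []string) ([]Result, error)
}

// sortResultsByDate orders results newest first in O(n log n), which is
// friendlier for the LLM than an O(n²) pairwise approach.
func sortResultsByDate(results []Result) {
	sort.Slice(results, func(i, j int) bool {
		return results[i].Date > results[j].Date
	})
}
```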
Stage 4: Final Synthesis
Bridge → Main LLM
↓
Re-prompt {
query: enhancedQuery,
context: RAG results
}
↓
LLM synthesizes answer from context
↓
Output: "Q3 2025 VX revenues in APAC were..."
↓
Bridge → Slack Client → User
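
Stage 4 amounts to a re-prompt with the retrieved contexts. A minimal sketch, assuming an `LLMClient` interface and prompt wording that are not taken from the PR:

```go
package bridge

import (
	"context"
	"fmt"
)

// LLMClient stands in for the main LLM (GPT-4.1 in the flow above).
type LLMClient interface {
	CallLLM(ctx context.Context, prompt string) (string, error)
}

// synthesizeAnswer re-prompts the LLM with the enhanced query and the
// formatted RAG contexts, and returns the final answer sent back to Slack.
func synthesizeAnswer(ctx context.Context, llm LLMClient, enhancedQuery, ragContext string) (string, error) {
	prompt := fmt.Sprintf(
		"Answer the question using only the context below.\n\nQuestion: %s\n\nContext:\n%s",
		enhancedQuery, ragContext,
	)
	return llm.CallLLM(ctx, prompt)
}
```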