@yekkhan-liftoff yekkhan-liftoff commented Oct 31, 2025

Summary

Enhances the RAG pipeline with query enhancement and Voyage AI embeddings. Query enhancement runs at the Slack client layer for every query, embeddings are generated via Voyage AI, and AWS S3 Vectors handles the similarity search.

Flow

Stage 1: Query Enhancement (Slack Layer)

User Query: "What were Q3 2025 revenues for VX in APAC?"

Slack Client → Query Enhancer (Claude Sonnet)

EnhanceQuery(query, today="2025-11-03")

Analyze: temporal query detected

Output: {
  enhancedQuery: "Q3 2025 VX revenues APAC",
  metadata: { generatedDate: "2025-09-30" }
}

Date Range Expanded: 2025-09-23 to 2025-09-30 (7-day lookback, both endpoints inclusive)
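
The date expansion in Stage 1 can be sketched as below. This is a minimal illustration, not the code in this PR: `expandDateWindow` and `lookbackDays` are hypothetical names, and the sketch assumes the window counts back from `generatedDate` with both endpoints included.

```go
package main

import (
	"fmt"
	"time"
)

// dateLayout is the ISO date format used in the flow above.
const dateLayout = "2006-01-02"

// expandDateWindow is a hypothetical helper. It returns every calendar date
// from (anchor - lookbackDays) through anchor, inclusive, oldest first.
func expandDateWindow(anchor string, lookbackDays int) ([]string, error) {
	if lookbackDays < 0 {
		return nil, fmt.Errorf("lookbackDays must be non-negative, got %d", lookbackDays)
	}
	end, err := time.Parse(dateLayout, anchor)
	if err != nil {
		return nil, fmt.Errorf("parse anchor date %q: %w", anchor, err)
	}
	dates := make([]string, 0, lookbackDays+1)
	for d := end.AddDate(0, 0, -lookbackDays); !d.After(end); d = d.AddDate(0, 0, 1) {
		dates = append(dates, d.Format(dateLayout))
	}
	return dates, nil
}

func main() {
	dates, err := expandDateWindow("2025-09-30", 7)
	if err != nil {
		panic(err)
	}
	fmt.Println(dates[0], "to", dates[len(dates)-1], "covering", len(dates), "dates")
}
```

With a 7-day lookback the list holds 8 dates, because both 2025-09-23 and 2025-09-30 are included.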

Stage 2: LLM Tool Detection

Slack Client → Main LLM (GPT-4.1)

CallLLM(enhancedQuery, systemPrompt)

LLM detects: needs rag_search tool

Output: ToolCall {
  name: "rag_search",
  args: { query: "Q3 2025 VX revenues APAC" }
}

Stage 3: RAG Search with Embeddings

Slack Client → LLM-MCP Bridge

ProcessLLMResponse(extraArgs: {metadata})

Bridge → RAG Client

CallTool("rag_search", args + metadata)

RAG extracts metadata from args

Build SearchOptions {
  limit: 30,
  dateFilter: ["2025-09-23"..."2025-09-30"]
}
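
A rough sketch of how the RAG client might turn the merged tool arguments into `SearchOptions`. The `date_filter` key, the `buildSearchOptions` name, and the struct fields are assumptions made for illustration; the PR's actual argument names may differ. Malformed metadata entries are skipped rather than failing the search.

```go
package main

import (
	"errors"
	"fmt"
)

// SearchOptions mirrors the fields shown in the flow above (hypothetical shape).
type SearchOptions struct {
	Limit      int
	DateFilter []string
}

// buildSearchOptions extracts the query and the optional date metadata from
// the tool-call arguments. The "date_filter" key is an assumed name.
func buildSearchOptions(args map[string]any, defaultLimit int) (string, SearchOptions, error) {
	query, ok := args["query"].(string)
	if !ok || query == "" {
		return "", SearchOptions{}, errors.New(`rag_search: missing or non-string "query" argument`)
	}
	opts := SearchOptions{Limit: defaultLimit}

	// Metadata is optional. JSON-decoded arguments arrive as []any, so each
	// element is checked and non-string or empty entries are dropped.
	switch v := args["date_filter"].(type) {
	case []string:
		opts.DateFilter = append(opts.DateFilter, v...)
	case []any:
		for _, item := range v {
			if s, ok := item.(string); ok && s != "" {
				opts.DateFilter = append(opts.DateFilter, s)
			}
		}
	}
	return query, opts, nil
}

func main() {
	args := map[string]any{
		"query":       "Q3 2025 VX revenues APAC",
		"date_filter": []any{"2025-09-23", 42, "2025-09-30"}, // 42 is ignored
	}
	query, opts, err := buildSearchOptions(args, 30)
	if err != nil {
		panic(err)
	}
	fmt.Printf("query=%q limit=%d dates=%v\n", query, opts.Limit, opts.DateFilter)
}
```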

RAG Client → Voyage AI Embedding Provider

POST /v1/embeddings
model: voyage-context-3
dimensions: 1024

Output: queryVector [1024 floats]

RAG Client → AWS S3 Vectors

QueryVectors {
  vector: queryVector,
  topK: 7,
  filter: { report_generated_date: {$in: dates} }
}

S3: Cosine similarity search + metadata filtering

Output: 7 results (scored & sorted)
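
To make the search step concrete, here is a conceptual in-memory version of "filter by date, score by cosine similarity, keep the top K". It is only an illustration of the idea: S3 Vectors does this server-side, and none of the types or functions below (`Doc`, `Scored`, `cosine`, `topK`) come from this PR or from the AWS SDK.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Doc is a hypothetical stored vector with its date metadata.
type Doc struct {
	ID     string
	Date   string
	Vector []float64
}

// Scored pairs a document with its similarity to the query.
type Scored struct {
	Doc   Doc
	Score float64
}

// cosine returns the cosine similarity of a and b, or 0 when the vectors
// differ in length, are empty, or either has zero magnitude.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// topK keeps documents whose date is in allowedDates, scores them against
// the query, and returns at most k results, highest score first.
func topK(query []float64, docs []Doc, allowedDates []string, k int) []Scored {
	allowed := make(map[string]struct{}, len(allowedDates))
	for _, d := range allowedDates {
		allowed[d] = struct{}{}
	}
	scored := make([]Scored, 0, len(docs))
	for _, d := range docs {
		if _, ok := allowed[d.Date]; !ok {
			continue // metadata filter: date outside the window
		}
		scored = append(scored, Scored{Doc: d, Score: cosine(query, d.Vector)})
	}
	sort.SliceStable(scored, func(i, j int) bool { return scored[i].Score > scored[j].Score })
	if k < 0 {
		k = 0
	}
	if len(scored) > k {
		scored = scored[:k]
	}
	return scored
}

func main() {
	docs := []Doc{
		{ID: "a", Date: "2025-09-30", Vector: []float64{1, 0}},
		{ID: "b", Date: "2025-09-29", Vector: []float64{0, 1}},
		{ID: "c", Date: "2025-06-30", Vector: []float64{1, 0}}, // outside the window
	}
	for _, r := range topK([]float64{1, 0}, docs, []string{"2025-09-29", "2025-09-30"}, 7) {
		fmt.Printf("%s %.2f\n", r.Doc.ID, r.Score)
	}
}
```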

RAG: sortResultsByDate() - newest first

Format: "Found 7 contexts...
--- Context 1 ---
Source: revenue_report.pdf
Date: 2025-09-30
Content: ..."
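
The post-processing step could look roughly like the sketch below: sort newest first in O(n log n) with the standard library, then render the context block handed to the LLM. The `Result` struct and `formatContexts` are hypothetical, the header wording is a guess at the elided text above, and the sort relies on ISO `YYYY-MM-DD` strings ordering chronologically when compared as plain strings.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Result is a hypothetical shape for one retrieved context.
type Result struct {
	Source  string
	Date    string // ISO date, e.g. "2025-09-30"
	Content string
}

// sortResultsByDate orders results newest first. The stable sort keeps the
// similarity ranking among results that share the same date.
func sortResultsByDate(results []Result) {
	sort.SliceStable(results, func(i, j int) bool {
		return results[i].Date > results[j].Date
	})
}

// formatContexts renders results in the layout sketched in the flow above.
func formatContexts(results []Result) string {
	var b strings.Builder
	fmt.Fprintf(&b, "Found %d contexts\n", len(results))
	for i, r := range results {
		fmt.Fprintf(&b, "--- Context %d ---\nSource: %s\nDate: %s\nContent: %s\n",
			i+1, r.Source, r.Date, r.Content)
	}
	return b.String()
}

func main() {
	results := []Result{
		{Source: "weekly_summary.pdf", Date: "2025-09-24", Content: "..."},
		{Source: "revenue_report.pdf", Date: "2025-09-30", Content: "..."},
	}
	sortResultsByDate(results)
	fmt.Print(formatContexts(results))
}
```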

Stage 4: Final Synthesis

Bridge → Main LLM

Re-prompt {
  query: enhancedQuery,
  context: RAG results
}

LLM synthesizes answer from context

Output: "Q3 2025 VX revenues in APAC were..."

Bridge → Slack Client → User

wyangsun commented Nov 3, 2025

Could we use LangChain Go's embedding framework? It supports the Voyage embedding model.
https://github.com/tmc/langchaingo/blob/main/embeddings/voyageai/voyageai.go
https://github.com/tmc/langchaingo/blob/main/embeddings/embedding.go

@yekkhan-liftoff
Author

Could we use LangChain Go's embedding framework? It supports the Voyage embedding model. https://github.com/tmc/langchaingo/blob/main/embeddings/voyageai/voyageai.go https://github.com/tmc/langchaingo/blob/main/embeddings/embedding.go

The model that I am using is not supported there yet.

@yekkhan-liftoff
Author

TODO: inject the query enhancement prompt instead of hardcoding it in the application.

@tommynguyen-vungle tommynguyen-vungle left a comment

LGTM

@yekkhan-liftoff yekkhan-liftoff merged commit 3cf3966 into main Nov 24, 2025
6 of 7 checks passed
wyangsun pushed a commit that referenced this pull request Nov 25, 2025
* feat: saturn query pipeline for rag optimisation

* feat: remove hardcoded limit

* feat: remove unused metadata

* feat: todo comments

* feat: todo comments

* feat: decouple query rewriting and rag search

* chore: remove unused comments

* fix: fix missing s3 config, make embedding model configurable

* feat: debug voyage api key

* feat: substitute rag embedding provider env var, remove debug log

* feat: add IRSA support to service account template

* feat: add logs

* feat: add observability for query enhancement

* fix: fix empty input and tool name for tool-execution span

* feat: added embedding span and fixed incorrect token usage

* feat: vector search span

* feat: make date filter field configurable

* feat: let llm handles the date window

* feat: inject query enhancement prompt

* feat: handle corrupted metadata

* fix: fix race condition in S3Provider.Initialize()

* perf(rag): optimize result sorting from O(n²) to O(n log n)

* fix: sort dates in descending order, better for LLM

* fix: fix test

* fix: fix golangci lint err

* fix: remove redundant metadata filtering

* refactor: dates filter are stored as int

* refactor: dates filter are stored as int

* fix: fix lint
wyangsun added a commit that referenced this pull request Nov 25, 2025
* PE-7777: Claude Sonnet 4.5 integration

* Support thinking for Claude Sonnet 4.5

* include Thinking Output In Response

* Fixed thinking messaged deletion to get thread replies (tuannvm#143)

Signed-off-by: rangamani54 <[email protected]>

* ci(cursor): Add Cursor automated code-review workflow

Signed-off-by: Tommy Nguyen <[email protected]>

* feat: saturn query pipeline for rag optimisation (#23)

* update CLAUDE.md

---------

Signed-off-by: rangamani54 <[email protected]>
Signed-off-by: Tommy Nguyen <[email protected]>
Co-authored-by: Ranga Mani Kumar <[email protected]>
Co-authored-by: Tommy Nguyen <[email protected]>
Co-authored-by: yekkhan-liftoff <[email protected]>
4 participants