feat: parallelize text and multimodal processing in process_document_complete #227

Open
ndcorder wants to merge 1 commit into HKUDS:main from ndcorder:feat/parallel-text-multimodal


ndcorder commented Mar 21, 2026

Text insertion and multimodal processing in process_document_complete currently run sequentially, even though they don't share any state. This PR runs them concurrently with asyncio.gather, so total time is roughly max(text, multimodal) instead of text + multimodal.

Uses return_exceptions=True so if one branch fails the other still completes.
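
A minimal sketch of the pattern described above, not the actual diff. The coroutines insert_text and process_multimodal are hypothetical stand-ins for the two branches, and the sleep delays are made up for illustration.

```python
import asyncio
import time


async def insert_text(delay: float) -> str:
    # Hypothetical stand-in for the text insertion branch.
    await asyncio.sleep(delay)
    return "text done"


async def process_multimodal(delay: float, fail: bool = False) -> str:
    # Hypothetical stand-in for the multimodal processing branch.
    await asyncio.sleep(delay)
    if fail:
        raise RuntimeError("multimodal failed")
    return "multimodal done"


async def run_both(fail_multimodal: bool = False) -> list:
    # With return_exceptions=True, gather returns exceptions as results
    # instead of raising the first one, so both branches run to completion.
    return await asyncio.gather(
        insert_text(0.2),
        process_multimodal(0.3, fail=fail_multimodal),
        return_exceptions=True,
    )


def timed_run() -> tuple[list, float]:
    # Wall-clock time should be close to the slower branch (0.3 s),
    # not the sum of both (0.5 s).
    start = time.perf_counter()
    results = asyncio.run(run_both())
    return results, time.perf_counter() - start
```

Note that with return_exceptions=True a failed branch shows up as an exception object in the result list, so the caller has to inspect the results to notice it.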

These two steps are independent so there's no reason to wait for text
insertion before starting multimodal. Uses asyncio.gather with
return_exceptions so one failing doesn't kill the other.

ndcorder force-pushed the feat/parallel-text-multimodal branch from eede248 to ab1167c on March 21, 2026 09:31

LarFii (Collaborator) commented Mar 24, 2026

Thanks for your contribution!

I found one P1 issue. In raganything/processor.py, the new parallel branch uses asyncio.gather(..., return_exceptions=True) and only logs per-task failures. That means process_document_complete() no longer propagates text/multimodal failures, so the outer error handler is skipped and the method can still emit on_document_complete / "processing complete" even when one branch failed. This is a behavioral regression from main and can silently report partially ingested documents as successful. I think the PR should preserve fail-fast semantics here, or at minimum aggregate task errors and raise after both tasks finish. The new test test_one_branch_failing_does_not_block_the_other currently locks in the broken behavior rather than detecting it.
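
One way the "aggregate task errors and raise after both tasks finish" option could look, as a rough sketch only. The helper name run_branches_then_raise and the demo coroutines ok and boom are hypothetical and not taken from raganything/processor.py.

```python
import asyncio


async def run_branches_then_raise(text_coro, multimodal_coro):
    # Let both branches finish, collecting exceptions as results.
    results = await asyncio.gather(
        text_coro, multimodal_coro, return_exceptions=True
    )
    labels = ("text", "multimodal")
    failures = [
        (label, result)
        for label, result in zip(labels, results)
        if isinstance(result, BaseException)
    ]
    if failures:
        summary = "; ".join(f"{label}: {exc!r}" for label, exc in failures)
        # Re-raise so the caller's outer error handler runs and no
        # completion signal is emitted for a partially ingested document.
        raise RuntimeError(
            f"document processing failed ({summary})"
        ) from failures[0][1]
    return results


async def ok(value):
    # Hypothetical branch that succeeds.
    await asyncio.sleep(0)
    return value


async def boom():
    # Hypothetical branch that fails.
    await asyncio.sleep(0)
    raise ValueError("bad image")
```

For the fail-fast option, a plain asyncio.gather without return_exceptions propagates the first exception to the awaiting caller, but it does not cancel the other awaitable, which keeps running in the background.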


2 participants