[RFC] Propose a new GenAIExample - visual search and QA #352

llin60 · 2025-04-15T08:59:59Z

This RFC proposes a new GenAIExample that integrates a multi-modal search engine with a visual QA (VQA) assistant, so that the assistant can answer questions more effectively by using the search results as visual context. The search engine and the VQA assistant can also work independently.

This application is a good fit for industries such as surveillance and smart cities, and for other domains that require efficient analysis of large-scale visual data.

yinghu5 · 2025-04-17T01:18:58Z

Hi @llin60
Thank you for the new example; it will be escalated for further discussion.
Just for reference:
Visual QnA example: https://github.com/opea-project/GenAIExamples/tree/main/VisualQnA
MultimodalQnA: https://github.com/opea-project/GenAIExamples/tree/main/MultimodalQnA, which can interpret a mix of textual, visual, and audio information drawn from the document contents.

Copilot

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

Copilot · 2025-04-17T01:20:25Z

community/rfcs/25-04-14-GenAIExamples-001-Visual_Search_and_QA.md

+curl http://localhost:6000/v1/embeddings
+-X POST
+-d '{"input":"traffic jam"}'
+-H 'Content-Type: application/json'


The curl example for the embeddings endpoint omits the HTTP method and headers, which may mislead users expecting a complete API request (e.g., missing '-X POST' and 'Content-Type: application/json'). Consider updating the example for consistency with the other endpoints.

Suggested change

-curl http://localhost:6000/v1/embeddings
--X POST
--d '{"input":"traffic jam"}'
--H 'Content-Type: application/json'
+curl -X POST http://localhost:6000/v1/embeddings \
+-H "Content-Type: application/json" \
+-d '{"input":"traffic jam"}'

Copilot · 2025-04-17T01:20:26Z

community/rfcs/25-04-14-GenAIExamples-001-Visual_Search_and_QA.md

+curl http://localhost:7000/v1/retrieval
+-X POST
+-d "{"embedding":${text_embedding},"search_type":"similarity", "k":4}"
+-H 'Content-Type: application/json'
+```
+


The curl command for the retrieval endpoint is missing important flags (such as '-X POST' and required headers) that are needed for a proper API call. Please revise the example to include these details for clarity.

Suggested change

-curl http://localhost:7000/v1/retrieval
--X POST
--d "{"embedding":${text_embedding},"search_type":"similarity", "k":4}"
--H 'Content-Type: application/json'
-```
+curl http://localhost:7000/v1/retrieval \
+-X POST \
+-d '{"embedding":"<text_embedding_placeholder>","search_type":"similarity","k":4}' \
+-H 'Content-Type: application/json'
+# Replace <text_embedding_placeholder> with the actual text embedding value.
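
Note that the suggested payload wraps the placeholder in quotes, so the embedding would be sent as a JSON string, whereas the original line interpolates `${text_embedding}` unquoted. One way to keep the variable and still produce valid JSON is to assemble the request body before calling curl. The following is a minimal sketch, not taken from the RFC: the `build_retrieval_payload` helper and the sample vector are hypothetical, and the endpoint and field names are copied from the example under review.

```shell
#!/bin/sh
# Hypothetical helper: wrap an embedding (already serialized as a JSON array)
# in the retrieval request body shown in the RFC example.
build_retrieval_payload() {
  # %s inserts the array text verbatim, so it stays a JSON array, not a string.
  printf '{"embedding":%s,"search_type":"similarity","k":4}' "$1"
}

# Placeholder vector for illustration only; a real embedding would come
# from the embeddings service.
text_embedding='[0.1, 0.2, 0.3]'
payload=$(build_retrieval_payload "$text_embedding")
echo "$payload"

# The request itself is left commented out because it needs a running service:
# curl -X POST http://localhost:7000/v1/retrieval \
#   -H 'Content-Type: application/json' \
#   -d "$payload"
```

Passing the body as `-d "$payload"` avoids nesting double quotes inside a double-quoted shell string, which the shell would otherwise strip from the JSON keys and values.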

Copilot · 2025-04-17T01:20:26Z

community/rfcs/25-04-14-GenAIExamples-001-Visual_Search_and_QA.md

+curl http://localhost:8888/v1/dbsearch_qna
+-X POST
+-d '{"text":"traffic jam"}'
+-H 'Content-Type: application/json'


The curl example for the combined search and Q&A endpoint omits the HTTP method and payload details, which might lead to confusion. Consider including the POST method and a sample payload to match the other examples.

Suggested change

-curl http://localhost:8888/v1/dbsearch_qna
--X POST
--d '{"text":"traffic jam"}'
--H 'Content-Type: application/json'
+curl http://localhost:8888/v1/dbsearch_qna \
+-X POST \
+-d '{"text":"traffic jam", "context_images": ["image1.jpg", "image2.jpg"], "k": 5}' \

llin60 · 2025-04-17T07:05:43Z

Hi, thank you for the info. I've studied the existing examples for multi-modal applications. It seems that they process visual data by converting it to text. However, the application we are proposing needs to store the visual data in its original form, because the original images and videos are the targets of interest. Details can be found in the documentation.

add doc for visual search QnA

4a54239

llin60 requested review from chensuyue, ftian1, mkbhanda, preethivenkatesh, chickenrae and tomlenth as code owners April 15, 2025 09:00

yinghu5 requested review from Copilot and yinghu5 April 17, 2025 01:19

Copilot AI reviewed Apr 17, 2025

yinghu5 requested review from lvliang-intel and Spycsh and removed request for tomlenth April 18, 2025 03:21

yinghu5 added the A0 need to scrub label Apr 22, 2025
