Feature/46 Improve retrieve performance #52

ehddnr301 · 2025-04-18T08:53:50Z

#️⃣ Issue Number

Improve retrieve performance #46

📝 Summary

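Added a button to the Streamlit app so that reranking can be enabled during retrieval.
When enabled, results are reranked with the Dongjin-kr/ko-reranker model.
This is not meant to be merged right away. Once the code from @DShomin or @anjaaaaeeeellll is merged, I plan to update the Streamlit code to match those changes and then merge.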
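The reranking step described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `rerank`, `overlap_score`, and the sample table descriptions are hypothetical names and data. The word-overlap scorer is only a stand-in so the example runs without downloading a model; in the PR the relevance score would come from Dongjin-kr/ko-reranker.

```python
from typing import Callable, List, Sequence, Tuple

ScoreFn = Callable[[str, str], float]


def overlap_score(query: str, doc: str) -> float:
    """Toy stand-in for a reranker model: fraction of query words found in doc."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & doc_words) / len(query_words)


def rerank(
    query: str,
    docs: Sequence[str],
    score_fn: ScoreFn = overlap_score,
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Score every candidate against the query and keep the top_n best."""
    scored = [(doc, score_fn(query, doc)) for doc in docs]
    # Python's sort is stable, so tied candidates keep their retrieval order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical table descriptions, as a first-stage retriever might return them.
candidates = [
    "users table: user id, signup date",
    "orders table: order id, user id, order amount",
    "products table: product id, price",
]
top = rerank("order amount per user", candidates, top_n=2)
for doc, score in top:
    print(f"{score:.2f}  {doc}")
```

Passing the scorer in as `score_fn` keeps the ordering logic independent of the model, so a model-backed scorer can be swapped in without touching `rerank`.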

💬 To Reviewers (optional)

With other retrieve methods in mind for the future, I moved the retrieval code into retrieve.py for now. Please review whether this separation is appropriate.

PR Checklist

TBD

reference) How to Code Review

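Thumbs up (👍): used when the reviewer wants to praise something in the reviewee's code.
Exclamation mark (❗): used when the reviewer requires the reviewee to change the code.
Question mark (❓): used when the reviewer wants to ask the reviewee for their opinion.
Pill (💊): used when the reviewer suggests an improvement to the reviewee's code that is not mandatory.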

- Add Reranker
- Added new dependencies: langchain-huggingface==0.1.2 and transformers==4.51.2 to requirements.txt.
- Removed the QueryRefinedAgainChain and its associated logic from chains.py and graph.py to streamline the query refinement process.

- Add reranking feature in the Streamlit app to enhance search result accuracy.
- Added new dependencies: transformers==4.51.2 and langchain-huggingface==0.1.2 to setup.py.
- Created a new retrieval module to handle vector database interactions and reranking logic.

DShomin · 2025-04-19T08:01:55Z

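👍 I think splitting the retrieve functionality into its own module is a good idea.
💊 I'm curious how performance differs before and after introducing rerank. It would be good to compare the retrieved tables and the resulting SQL (and the SQL generation time as well).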

ehddnr301 · 2025-04-20T05:14:28Z

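> 👍 I think splitting the retrieve functionality into its own module is a good idea.

> 💊 I'm curious how performance differs before and after introducing rerank. It would be good to compare the retrieved tables and the resulting SQL (and the SQL generation time as well).

When I tried it experimentally at my company, it felt like table search became somewhat more accurate after introducing rerank. I don't yet have a good idea of how to compare that quantitatively...! If SQL generation time means the overall end-to-end time, it did get slower: around 30 seconds, compared with under 10 seconds before. (Most of that time is spent in the reranker.)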
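As one possible starting point for the quantitative comparison discussed above, the sketch below measures wall-clock time and compares the retrieved table lists with and without rerank. It assumes a small hand-labeled set of expected tables per question, which this PR does not include; `timed`, `jaccard`, `hit_rate`, and all sample values are hypothetical and only illustrate the metrics.

```python
import time
from typing import Any, Callable, List, Tuple


def timed(fn: Callable[..., Any], *args: Any) -> Tuple[Any, float]:
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def jaccard(a: List[str], b: List[str]) -> float:
    """Overlap of two table lists: 1.0 means identical sets, 0.0 means disjoint."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def hit_rate(retrieved: List[str], expected: List[str]) -> float:
    """Fraction of the expected tables that appear in the retrieved list."""
    if not expected:
        return 0.0
    return len(set(retrieved) & set(expected)) / len(expected)


# Hypothetical values for a single question; not measurements from this PR.
before = ["users", "products", "orders"]   # tables retrieved without rerank
after = ["orders", "users", "payments"]    # tables retrieved with rerank
expected = ["orders", "payments"]          # hand-labeled answer (assumed to exist)

print("overlap before/after:", jaccard(before, after))
print("hit rate without rerank:", hit_rate(before, expected))
print("hit rate with rerank:", hit_rate(after, expected))

# timed() can wrap either retrieval path; sorted() is only a placeholder here.
_, seconds = timed(sorted, before)
print(f"elapsed: {seconds:.6f}s")
```

Averaging `hit_rate` and the elapsed time over a fixed list of questions, once with and once without rerank, would give a before/after table for both accuracy and latency.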

ehddnr301 added 5 commits April 5, 2025 06:05

Implement QueryRefinedAgainChain for enhanced query refinement process

b14a396

Remove obsolete binary files

fa9652e

Remove debug print statement

c59211f

Refactor query handling in llm_utils

335fd9d

- Add Reranker
- Added new dependencies: langchain-huggingface==0.1.2 and transformers==4.51.2 to requirements.txt.
- Removed the QueryRefinedAgainChain and its associated logic from chains.py and graph.py to streamline the query refinement process.

ehddnr301 linked an issue Apr 18, 2025 that may be closed by this pull request

Improve retrieve performance #46

Open

2 tasks
