audio-to-doc-search

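A Streamlit-based application that lets users upload or record spoken questions, transcribes them with ElevenLabs Speech-to-Text, retrieves relevant context from documents using retrieval-augmented generation (RAG), generates answers with an LLM served by Ollama, and plays the answers back with ElevenLabs Text-to-Speech.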

Directory Structure:

`app/`

Contains the core application modules:

- `rag_pipeline.py` – Loads the vector database, retrieves context chunks, and defines `get_llm_response(query)` to call the LLM.
- `llm_ollama.py` – Handles communication with the Ollama server for LLM inference.
- `stt_elevenlabs.py` – Wraps the ElevenLabs Speech-to-Text API to transcribe uploaded or recorded audio.
- `tts_elevenlabs.py` – Wraps the ElevenLabs Text-to-Speech API to synthesize audio from text responses.
- `utils.py` – Utility functions for configuration loading, file handling, and shared helpers.
- `__init__.py` – Marks `app/` as a Python package.
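As a rough illustration of how the retrieval and Ollama modules might fit together, the sketch below assembles a prompt from retrieved chunks and builds a non-streaming request for Ollama's `/api/generate` endpoint. The function names (`build_prompt`, `build_ollama_request`, `ask_ollama`), the prompt wording, the model name `llama3`, and the default URL are assumptions for illustration, not taken from the repository's code.

```python
import json
import urllib.request

# Ollama's default local address; adjust if your server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_prompt(query, chunks):
    """Combine retrieved context chunks and the user's question into one prompt."""
    context = "\n\n".join(chunks)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


def build_ollama_request(prompt, model="llama3"):
    """Build (without sending) a non-streaming POST request for /api/generate."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_ollama(prompt, model="llama3", timeout=60):
    """Send the request and return the generated text; needs a running Ollama server."""
    request = build_ollama_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Keeping request construction separate from the network call makes the payload easy to inspect or unit-test without a running server.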

`sit-data/`

Holds sample SIT (System Integration Testing) documents used to build and test the RAG retrieval workflows.

`vector_context/`

Stores precomputed vector embeddings and context files for fast similarity search during the RAG process.
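The similarity search over stored embeddings can be sketched as below. This README does not say which vector store or similarity metric the project uses, so cosine similarity over plain Python lists is an assumption, and `cosine_similarity` and `top_k` are hypothetical helper names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, index, k=2):
    """Return the k chunk texts whose embeddings are most similar to query_vec.

    `index` is a list of (chunk_text, embedding) pairs, standing in for
    whatever precomputed store the application actually loads.
    """
    scored = [(cosine_similarity(query_vec, emb), text) for text, emb in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]
```

A dedicated vector database would replace the linear scan with an approximate nearest-neighbor index, but the ranking idea is the same.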

`.gitignore`

Specifies files and directories (e.g., virtual environments, temporary uploads) that Git should ignore.

`LICENSE`

The full text of the MIT License, under which this project is released.

`README.md`

This file. It provides an overview and concise descriptions of the repository contents.

`requirements.txt`

Lists all Python dependencies required to run the application (Streamlit, Requests, ElevenLabs & Ollama SDKs, etc.).

`streamlit_app.py`

The main Streamlit application:

- Audio upload/recording
- STT transcription
- RAG query to the LLM
- TTS playback of the response
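The four stages above can be sketched as one function that chains them. This is a minimal outline rather than the app's actual code: `answer_audio_question` and its parameters are hypothetical, and each stage is passed in as a callable so that the real ElevenLabs, retrieval, and Ollama wrappers (whose signatures are not shown in this README) can be swapped in.

```python
def answer_audio_question(audio_bytes, transcribe, retrieve, generate, synthesize):
    """Run the audio -> text -> context -> answer -> speech flow.

    transcribe(audio_bytes) -> str          # speech-to-text
    retrieve(question) -> list[str]         # RAG context chunks
    generate(question, chunks) -> str       # LLM answer
    synthesize(answer) -> bytes             # text-to-speech audio
    """
    question = transcribe(audio_bytes)
    chunks = retrieve(question)
    answer = generate(question, chunks)
    speech = synthesize(answer)
    return {"question": question, "answer": answer, "speech": speech}
```

In a Streamlit script, the returned `speech` bytes could be handed to `st.audio` for playback and the `answer` text shown alongside it.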

`streamlit_app_medical.py`

A domain-specific variant of streamlit_app.py tailored for medical SIT data and prompts.

`streamlit_app_sit.py`

A specialized Streamlit script demonstrating end-to-end queries against the SIT dataset.
