🚀 MediVision.AI is a Generative AI-powered multimodal medical assistant that combines LLMs, speech processing, and vision-based analysis to deliver AI-driven medical insights. Built with Gradio and integrated with the Llama 3.3, ElevenLabs, and Groq APIs, this AI doctor supports speech-to-text (STT), text-to-speech (TTS), and medical query resolution.
🔗 Live Demo: MediVision.AI on Hugging Face
✅ Generative AI-Powered Medical Assistance: Uses LLMs for medical query resolution.
✅ Multimodal Interaction: Accepts images, text, and voice inputs.
✅ Speech Processing: Supports speech-to-text (STT) and text-to-speech (TTS) via ElevenLabs and gTTS (a minimal TTS sketch follows this list).
✅ Gradio UI: User-friendly web interface for seamless interactions.
✅ Cloud-Hosted: Deployed on Hugging Face Spaces for real-time access.
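The project lists both ElevenLabs and gTTS as TTS engines. Below is a minimal gTTS-only sketch of the TTS step (gTTS needs no API key); the function name is an assumption for illustration, not the project's actual `voice_of_the_doctor.py` code.

```python
# Minimal TTS sketch using gTTS only (no API key needed). Illustrative; the
# project's voice_of_the_doctor.py also uses ElevenLabs and may look different.
from gtts import gTTS

def speak(text: str, out_path: str = "doctor_reply.mp3") -> str:
    """Synthesize `text` to an MP3 file and return its path."""
    gTTS(text=text, lang="en").save(out_path)
    return out_path

if __name__ == "__main__":
    print(speak("Please keep the wound clean and consult a dermatologist if it worsens."))
```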
MediVision.AI follows a multi-agent AI architecture where different components handle specific tasks:
- 🧠 "Brain of the Doctor" (
brain_of_the_doctor.py
) – Calls Groq's multimodal LLM API to analyze queries and images. - 🗣 "Voice of the Doctor" (
voice_of_the_doctor.py
) – Converts AI-generated responses into speech using ElevenLabs & gTTS. - 🧑⚕️ "Voice of the Patient" (
voice_of_the_patient.py
) – Captures user voice input and transcribes it using speech_recognition & Groq API. - 📟 "Gradio Interface" (
gradio_app.py
) – Integrates all components into an interactive web-based UI.
This modular design ensures seamless API-driven AI interactions.
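A rough sketch of how `gradio_app.py` might wire these modules together; the imported helper names (`transcribe_audio`, `analyze_image_and_query`, `text_to_speech`) are assumptions for illustration, not the project's actual exports.

```python
# Rough wiring sketch for gradio_app.py. The imported helper names are
# assumptions; the real modules may expose different functions.
import gradio as gr

from voice_of_the_patient import transcribe_audio        # assumed helper (STT)
from brain_of_the_doctor import analyze_image_and_query  # assumed helper (LLM)
from voice_of_the_doctor import text_to_speech           # assumed helper (TTS)

def consult(audio_path, image_path):
    """Patient audio -> transcript -> multimodal LLM answer -> spoken reply."""
    query = transcribe_audio(audio_path)
    answer = analyze_image_and_query(query, image_path)
    reply_audio_path = text_to_speech(answer)
    return answer, reply_audio_path

demo = gr.Interface(
    fn=consult,
    inputs=[
        gr.Audio(sources=["microphone"], type="filepath", label="Describe your symptoms"),  # Gradio 4.x style
        gr.Image(type="filepath", label="Optional medical image"),
    ],
    outputs=[
        gr.Textbox(label="Doctor's response"),
        gr.Audio(type="filepath", label="Spoken response"),
    ],
    title="MediVision.AI",
)

if __name__ == "__main__":
    demo.launch()
```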
Ensure you have the following installed:
- Python 3.8+
- Pip & Virtualenv
- FFmpeg (for voice processing)
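Since pydub shells out to the FFmpeg binary, it must be discoverable on your PATH. An optional Python pre-flight check (not part of the project):

```python
# Optional pre-flight check: pydub shells out to the ffmpeg binary,
# so it must be discoverable on PATH.
import shutil

if shutil.which("ffmpeg") is None:
    raise SystemExit("FFmpeg not found. Install it and make sure it is on your PATH.")
print("FFmpeg found at:", shutil.which("ffmpeg"))
```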
# Clone the repository
git clone https://github.com/utkarshranaa/MediVision.AI.git
cd MediVision.AI
# Create and activate a virtual environment
python3 -m venv env
source env/bin/activate # On Windows, use `env\Scripts\activate`
# Install dependencies
pip install -r requirements.txt
# Run the application
python gradio_app.py
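The Groq and ElevenLabs clients authenticate with API keys, typically read from environment variables. A small pre-flight check; the variable names `GROQ_API_KEY` and `ELEVENLABS_API_KEY` are the SDK defaults, and it is an assumption that this project reads those exact names.

```python
# Pre-flight check for API keys (variable names are the SDK defaults and an
# assumption about what this project reads).
import os

for key in ("GROQ_API_KEY", "ELEVENLABS_API_KEY"):
    if not os.environ.get(key):
        print(f"Warning: {key} is not set; the corresponding API calls will fail.")
```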
- Upload images, input symptoms, and use voice commands via the web interface.
- The AI processes your input using the Llama 3.3 LLM API (via Groq) and speech models; a sketch of the multimodal call follows this list.
- Receive AI-driven responses, either as text or voice.
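As an illustration of the processing step, here is a hedged sketch of a multimodal chat call through Groq's OpenAI-compatible API. The model name is a placeholder and the request format is an assumption; the project's `brain_of_the_doctor.py` may differ.

```python
# Illustrative multimodal call via Groq's OpenAI-compatible chat API.
# The model name below is a placeholder; use a vision-capable model
# available on your Groq account. Not the project's exact code.
import base64
from groq import Groq  # pip install groq

client = Groq()  # reads GROQ_API_KEY from the environment

def ask_doctor(query: str, image_path: str,
               model: str = "llama-3.2-11b-vision-preview") -> str:
    """Send the patient's question plus an image to a vision-capable Groq model."""
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()
    response = client.chat.completions.create(
        model=model,
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": query},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content
```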
MediVision.AI integrates cutting-edge AI APIs:
- Groq API: Multimodal LLM for text and image-based medical reasoning.
- ElevenLabs & gTTS: Advanced text-to-speech (TTS) engines.
- SpeechRecognition & pydub: Speech-to-text (STT) processing.
- Gradio: Interactive AI-powered web UI.
- Cloud Hosting: Hugging Face Spaces for real-time inference.
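To show how the STT pieces fit together, here is a rough sketch that records microphone audio with SpeechRecognition and transcribes it through Groq's Whisper endpoint. The helper name and model choice are assumptions, not the project's exact `voice_of_the_patient.py`.

```python
# Sketch of the speech-to-text path: record with SpeechRecognition, then
# transcribe through Groq's Whisper endpoint. Assumed helper name and model.
import speech_recognition as sr
from groq import Groq

def record_and_transcribe(model: str = "whisper-large-v3") -> str:
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:          # needs a working microphone + PyAudio
        recognizer.adjust_for_ambient_noise(source)
        audio = recognizer.listen(source)    # records until a pause is detected
    client = Groq()                          # reads GROQ_API_KEY from the environment
    transcription = client.audio.transcriptions.create(
        file=("patient.wav", audio.get_wav_data()),  # (filename, raw WAV bytes)
        model=model,
    )
    return transcription.text
```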
📌 Upcoming Enhancements:
- 🔹 Conversational Memory: AI remembers patient history.
- 🔹 Improved STT Models: Enhancing speech input accuracy.
- 🔹 Multilingual Support: Expanding to different languages.
- 🔹 Mobile App Version: Bringing AI diagnostics to mobile devices.
This project is licensed under the MIT License. See the LICENSE file for details.
For any inquiries or collaborations, reach out via GitHub or connect on LinkedIn!
🔗 Author: Utkarsh Ranaa
🔗 Project Repository: MediVision.AI on GitHub