A Python-based voice assistant that combines speech recognition, text-to-speech, and an AI language model to hold spoken conversations.
- 🎤 Speech recognition using Google Speech Recognition
- 🗣️ Text-to-speech output with adjustable speech rate
- 🤖 AI-powered responses using Ollama's Mistral model
- 💭 Conversation history tracking
- 🔄 Continuous conversation loop
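Conversation history tracking comes down to replaying earlier turns in each new prompt. The following is a minimal plain-Python sketch of that idea, not the code in app.py: the `History` class, its method names, and the `max_turns` cap are all hypothetical stand-ins for whatever chat-history object the app actually uses.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class History:
    """Hypothetical in-memory conversation log (not the app's actual class)."""
    turns: List[Tuple[str, str]] = field(default_factory=list)
    max_turns: int = 10  # assumed cap so the prompt does not grow without bound

    def add(self, role: str, text: str) -> None:
        """Record one turn and drop the oldest turns beyond the cap."""
        self.turns.append((role, text))
        self.turns = self.turns[-self.max_turns:]

    def as_prompt(self, question: str) -> str:
        """Render earlier turns plus the new question as one prompt string."""
        lines = [f"{role}: {text}" for role, text in self.turns]
        lines.append(f"User: {question}")
        lines.append("AI:")
        return "\n".join(lines)
```

Capping the number of stored turns is one simple way to keep prompts short; the right limit depends on the model's context window.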
- Python 3.8 or higher
- Microphone for speech input
- Internet connection for speech recognition
- Ollama installed with Mistral model
git clone <repository-url>
cd voice-assistant
pip install SpeechRecognition pyttsx3 langchain-community langchain-core langchain-ollama pyaudio
On macOS/Linux:
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Pull the Mistral model
ollama pull mistral
On Windows:
- Download Ollama from ollama.ai
- Install and run:
ollama pull mistral
On macOS:
brew install portaudio
On Ubuntu/Debian:
sudo apt-get install portaudio19-dev python3-pyaudio
On Windows:
- PyAudio should install automatically with pip
- Start the voice assistant:
  python app.py
- Interact with the assistant:
  - Wait for the "Listening..." prompt
  - Speak your question or command
  - Listen to the AI response
  - Continue the conversation
- Exit the assistant:
  - Say "exit" or "stop" to end the session
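The interaction steps above can be sketched as a small loop. This is an illustration under assumptions, not the code in app.py: `listen`, `respond`, and `speak` are hypothetical callables standing in for speech recognition, the LLM call, and text-to-speech, and both the exact-match rule for exit words and the "Goodbye!" message are guesses.

```python
from typing import Callable

EXIT_WORDS = {"exit", "stop"}  # the spoken commands that end the session


def is_exit_command(text: str) -> bool:
    """True if the recognized text is exactly one of the exit words.

    Assumption: exact match after trimming, lowercasing, and dropping
    trailing punctuation. The real app may match differently.
    """
    return text.strip().lower().rstrip(".!") in EXIT_WORDS


def run_loop(listen: Callable[[], str],
             respond: Callable[[str], str],
             speak: Callable[[str], None]) -> int:
    """Run listen -> respond -> speak until an exit word is heard.

    Returns the number of questions answered.
    """
    answered = 0
    while True:
        heard = listen()
        if is_exit_command(heard):
            speak("Goodbye!")  # assumed farewell message
            return answered
        speak(respond(heard))
        answered += 1
```

Passing the three stages in as callables keeps the loop testable without a microphone, for example `run_loop(input, str.upper, print)` in a terminal.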
Modify the speech rate in app.py:
engine.setProperty('rate', 160)  # Adjust value (default: 160)
Replace "mistral" with another Ollama model:
llm = OllamaLLM(model="llama2")  # or other models
"Could not request results" error:
- Check internet connection
- Verify microphone permissions
"No module named 'pyaudio'" error:
# On macOS
brew install portaudio
pip install pyaudio
# On Linux
sudo apt-get install portaudio19-dev
pip install pyaudio
Ollama connection error:
- Ensure Ollama is running:
ollama serve
- Verify Mistral model is installed:
ollama list
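The two checks above can also be scripted. The sketch below assumes Ollama is listening on its default address, http://localhost:11434, and queries its `/api/tags` endpoint, which lists installed models. The function names are hypothetical and not part of app.py.

```python
import json
import urllib.error
import urllib.request
from typing import List, Optional

OLLAMA_URL = "http://localhost:11434"  # Ollama's default address; adjust if changed


def model_names(tags_payload: dict) -> List[str]:
    """Extract model names from the JSON returned by GET /api/tags."""
    return [m.get("name", "") for m in tags_payload.get("models", [])]


def has_model(tags_payload: dict, model: str) -> bool:
    """True if `model` is installed under any tag (e.g. 'mistral:latest')."""
    return any(name.split(":")[0] == model for name in model_names(tags_payload))


def fetch_tags(base_url: str = OLLAMA_URL, timeout: float = 3.0) -> Optional[dict]:
    """Return the /api/tags payload, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return None


def diagnose(tags_payload: Optional[dict], model: str = "mistral") -> str:
    """Turn the payload (or None) into a one-line hint."""
    if tags_payload is None:
        return "Ollama is not reachable; try `ollama serve`."
    if not has_model(tags_payload, model):
        return f"{model} is not installed; try `ollama pull {model}`."
    return f"Ollama is running and {model} is installed."
```

Running `print(diagnose(fetch_tags()))` reports which of the two steps still needs attention.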
Microphone not working:
- Check system microphone permissions
- Test microphone with other applications
- Try different microphone devices
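To try a different microphone, it helps to see which devices are available. The sketch below uses `Microphone.list_microphone_names()` from the SpeechRecognition package (which needs PyAudio installed). The helper names `find_device` and `list_microphones` are hypothetical, and whether app.py exposes a device setting is not stated here, so treat the wiring as an assumption.

```python
from typing import List, Optional


def find_device(names: List[Optional[str]], keyword: str) -> Optional[int]:
    """Return the index of the first device whose name contains `keyword`."""
    for index, name in enumerate(names):
        # Some devices may report no name; skip those entries.
        if name and keyword.lower() in name.lower():
            return index
    return None


def list_microphones() -> List[Optional[str]]:
    """List audio device names as reported by SpeechRecognition."""
    # Imported lazily so find_device() stays usable without the package.
    import speech_recognition as sr
    return sr.Microphone.list_microphone_names()
```

For example, `find_device(list_microphones(), "usb")` returns an index that can be passed as `sr.Microphone(device_index=...)` when opening the microphone; "usb" is only an example keyword.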
voice-assistant/
├── app.py # Main application file
├── readme.md # This file
└── requirements.txt # Python dependencies (optional)
- speech_recognition: For converting speech to text
- pyttsx3: For text-to-speech conversion
- langchain-community: For chat message history
- langchain-core: For prompt templates
- langchain-ollama: For Ollama LLM integration
- pyaudio: For microphone audio capture
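Before starting the assistant, a quick check can confirm that these dependencies are importable. The sketch below uses only the standard library. The `REQUIRED` mapping and the `missing_modules` helper are hypothetical names, and the pip package names reflect how the packages are published on PyPI.

```python
import importlib.util
from typing import Dict, List

# Import name -> pip package name (the two differ for some packages).
REQUIRED: Dict[str, str] = {
    "speech_recognition": "SpeechRecognition",
    "pyttsx3": "pyttsx3",
    "langchain_community": "langchain-community",
    "langchain_core": "langchain-core",
    "langchain_ollama": "langchain-ollama",
    "pyaudio": "pyaudio",
}


def missing_modules(import_names: List[str]) -> List[str]:
    """Return the import names that cannot be found in this environment."""
    return [name for name in import_names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(list(REQUIRED))
    if missing:
        print("Missing:", ", ".join(missing))
        print("Try: pip install " + " ".join(REQUIRED[m] for m in missing))
    else:
        print("All dependencies are importable.")
```

`find_spec` only looks the modules up and does not import them, so the check works even when PyAudio is installed but no microphone is attached.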
Feel free to submit issues and enhancement requests!
This project is open source and available under the MIT License.