📞 AI Phone Agent

AI-powered voice assistant for automating phone calls

✨ Features • 🚀 Quick Start • 🎭 Personas • 🏗️ Architecture • 📖 Documentation

🌟 Overview

AI Phone Agent is a voice-based AI application that conducts real-time phone conversations using Google Gemini's audio streaming capabilities. It provides speech-to-speech interaction with customizable AI personas for use cases such as booking reservations, handling customer calls, and providing tech support.

🎙️ Real-time Voice	🤖 Multiple Personas	📝 Live Transcription	🔊 Natural Speech
Bidirectional audio streaming	5 built-in presets + custom	See conversations in real-time	Multiple voice options

✨ Features

🗣️ Real-time Voice Conversations - Bidirectional audio streaming with Google Gemini
🎭 Customizable Personas - Switch between different AI personalities or create your own
📝 Live Transcription - See both user and agent speech transcribed in real-time
🔊 Multiple Voices - Choose from 5 different voice options (Puck, Charon, Kore, Fenrir, Zephyr)
⚡ Low Latency - Optimized audio pipeline for natural conversation flow
🎨 Modern UI - Clean, phone-like interface built with React and Tailwind CSS
📱 Responsive Design - Works seamlessly across devices

🚀 Quick Start

Prerequisites

📦 Node.js (v18 or higher recommended)
🔑 Google Gemini API Key - Get one at Google AI Studio

Installation

# Clone the repository
git clone https://github.com/yourusername/ai-phone-agent.git
cd ai-phone-agent

# Install dependencies
npm install

# Configure environment
cp .env.example .env.local

Configuration

If you did not copy .env.example during installation, create a .env.local file in the root directory. Then set your API key in it:

GEMINI_API_KEY=your_gemini_api_key_here
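
A missing or placeholder key is an easy setup mistake to hit, so a small guard can fail fast with a readable message. The sketch below is illustrative only: `requireGeminiApiKey` is a hypothetical helper (not taken from this repository), and how the app actually reads the variable is not shown in this README.

```typescript
// Hypothetical helper, not part of the repository: validates an API key
// read from any env-like object (for example, process.env in Node).
const PLACEHOLDER_KEY = "your_gemini_api_key_here";

function requireGeminiApiKey(env: Record<string, string | undefined>): string {
  const key = (env["GEMINI_API_KEY"] ?? "").trim();
  if (key === "") {
    throw new Error("GEMINI_API_KEY is not set; add it to .env.local");
  }
  if (key === PLACEHOLDER_KEY) {
    throw new Error("GEMINI_API_KEY still holds the placeholder value");
  }
  return key;
}
```

Calling `requireGeminiApiKey(process.env)` once at startup would surface a configuration problem before any call is attempted.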

Running the App

# Start development server
npm run dev

🎉 Open http://localhost:3000 in your browser!

🎭 Personas

AI Phone Agent comes with 5 pre-configured personas for common use cases:

Persona	Description	Voice	Use Case
🧑‍💼 Personal Assistant	Helpful assistant for general tasks	Kore	General inquiries & tasks
🍽️ Restaurant Booker	Makes dinner reservations	Zephyr	Outbound booking calls
🏢 Business Receptionist	Answers calls for TechSolutions Inc	Puck	Inbound business calls
🔧 Tech Support	Troubleshoots internet issues	Fenrir	Customer support
📋 Call Screener	Screens incoming calls	Charon	Call filtering

Custom Personas

Create your own persona by configuring:

Name - Display name for the persona
Voice - Choose from available voices
System Instructions - Define the AI's behavior and role
Greeting - Initial message spoken when call starts
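
The four fields above map naturally onto a small TypeScript type. The following is a minimal sketch under stated assumptions: the `Persona` interface, the field names, and `createPersona` are hypothetical and may not match the definitions in types.ts or constants.ts. Only the five voice names come from this README.

```typescript
// Voice names listed in the Features section of this README.
const VOICES = ["Puck", "Charon", "Kore", "Fenrir", "Zephyr"] as const;
type VoiceName = (typeof VOICES)[number];

// Hypothetical shape for a persona; the real type in types.ts may differ.
interface Persona {
  name: string;              // Display name for the persona
  voice: VoiceName;          // One of the available voices
  systemInstruction: string; // Defines the AI's behavior and role
  greeting: string;          // Spoken when the call starts
}

// Hypothetical factory that validates user input before building a persona.
function createPersona(input: {
  name: string;
  voice: string;
  systemInstruction: string;
  greeting: string;
}): Persona {
  const voice = VOICES.find((v) => v === input.voice);
  if (voice === undefined) {
    throw new Error(`Unknown voice "${input.voice}"; expected one of ${VOICES.join(", ")}`);
  }
  if (input.name.trim() === "") {
    throw new Error("Persona name must not be empty");
  }
  if (input.systemInstruction.trim() === "") {
    throw new Error("System instructions must not be empty");
  }
  return {
    name: input.name.trim(),
    voice,
    systemInstruction: input.systemInstruction.trim(),
    greeting: input.greeting.trim(),
  };
}

// Example: an illustrative custom persona (not one of the built-in presets).
const petSitter = createPersona({
  name: "Pet Sitter Scheduler",
  voice: "Kore",
  systemInstruction: "You schedule pet-sitting visits. Confirm dates and times.",
  greeting: "Hi, I'm calling to arrange a pet-sitting visit.",
});
```

Validating the voice against a fixed list keeps a typo in a custom persona from reaching the API as an unknown voice name.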

🛠️ Tech Stack

Category	Technology
⚛️ Frontend	React 19
📘 Language	TypeScript 5.8
⚡ Build Tool	Vite 6
🤖 AI/ML	Google Gemini SDK
🎨 Styling	Tailwind CSS
🔊 Audio	Web Audio API

🏗️ Architecture

ai-phone-agent/
├── 📁 components/           # React UI components
│   ├── CallScreen.tsx       # Main call interface & audio handling
│   ├── WelcomeScreen.tsx    # Persona selection screen
│   ├── StatusIndicator.tsx  # Call status display
│   └── Icons.tsx            # SVG icon components
├── 📁 services/
│   └── geminiService.ts     # Gemini API integration
├── 📁 utils/
│   └── audioUtils.ts        # Audio encoding utilities
├── 📄 App.tsx               # Root component
├── 📄 types.ts              # TypeScript definitions
├── 📄 constants.ts          # Config & persona presets
└── 📄 vite.config.ts        # Build configuration

Audio Pipeline

┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│  Microphone │────▶│ 16kHz PCM    │────▶│   Gemini    │
│   Input     │     │ Base64 Encode│     │   Live API  │
└─────────────┘     └──────────────┘     └──────┬──────┘
                                                │
┌─────────────┐     ┌──────────────┐            │
│   Speaker   │◀────│ 24kHz Decode │◀───────────┘
│   Output    │     │ AudioBuffer  │
└─────────────┘     └──────────────┘
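
The two conversion steps in the diagram can be sketched as below. This is not the code in utils/audioUtils.ts. The function names are hypothetical, and the sketch assumes mono, 16-bit, little-endian PCM on a little-endian host (typed arrays use the platform byte order). Resampling to 16 kHz and playback at 24 kHz are left out.

```typescript
// Convert Float32 samples in [-1, 1] to signed 16-bit PCM, clamping out-of-range input.
function floatTo16BitPcm(samples: Float32Array): Int16Array {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    pcm[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return pcm;
}

// Encode raw bytes as base64 using the standard btoa global.
function bytesToBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Decode a base64 string back into raw bytes using the standard atob global.
function base64ToBytes(b64: string): Uint8Array {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Microphone side: Float32 samples -> 16-bit PCM -> base64 string.
function encodeMicChunk(samples: Float32Array): string {
  const pcm = floatTo16BitPcm(samples);
  return bytesToBase64(new Uint8Array(pcm.buffer));
}

// Speaker side: base64 string -> 16-bit PCM -> Float32 samples in [-1, 1).
function decodeAgentChunk(b64: string): Float32Array {
  const bytes = base64ToBytes(b64);
  const pcm = new Int16Array(bytes.buffer, 0, Math.floor(bytes.length / 2));
  const out = new Float32Array(pcm.length);
  for (let i = 0; i < pcm.length; i++) {
    out[i] = pcm[i] / 0x8000;
  }
  return out;
}
```

In a browser, the decoded samples could then be copied into an AudioBuffer created with `audioContext.createBuffer(1, samples.length, 24000)` for playback. The sample rate is the only part of the pipeline that differs between the two directions.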

📝 Scripts

Command	Description
`npm run dev`	🚀 Start development server
`npm run build`	📦 Build for production
`npm run preview`	👁️ Preview production build

🔧 Configuration

Environment Variables

Variable	Required	Description
`GEMINI_API_KEY`	✅ Yes	Your Google Gemini API key

Gemini Models Used

Live Conversations: gemini-2.5-flash-native-audio-preview-09-2025
Text-to-Speech: gemini-2.5-flash-preview-tts
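
To show how a model ID and a persona's voice might come together, here is a hedged sketch of a session options object. The `MODELS` constant and `buildLiveSessionOptions` are hypothetical names. The config field names (`responseModalities`, `speechConfig`, `systemInstruction`) reflect an assumption about what the Google Gemini SDK's Live API accepts and should be checked against the SDK documentation. The block deliberately builds a plain object and does not import the SDK.

```typescript
// Model IDs as listed in this README.
const MODELS = {
  live: "gemini-2.5-flash-native-audio-preview-09-2025",
  tts: "gemini-2.5-flash-preview-tts",
} as const;

// Hypothetical builder; field names are an assumption about the SDK's
// Live API config shape, not confirmed by this repository.
function buildLiveSessionOptions(voiceName: string, systemInstruction: string) {
  return {
    model: MODELS.live,
    config: {
      responseModalities: ["AUDIO"],
      speechConfig: {
        voiceConfig: { prebuiltVoiceConfig: { voiceName } },
      },
      systemInstruction,
    },
  };
}

// Example: options for the Tech Support preset's voice (Fenrir).
const techSupportOptions = buildLiveSessionOptions(
  "Fenrir",
  "You troubleshoot internet issues step by step.",
);
```

Keeping the model IDs in one constant makes it easier to swap preview models when newer versions are released.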

📖 Documentation

CLAUDE.MD - AI assistant context and codebase guide
Google Gemini API - Gemini API documentation
React Documentation - React framework docs
Vite Guide - Vite build tool docs

🌐 Deployment

Production Build

# Create optimized build
npm run build

# Preview locally
npm run preview

The build output will be in the dist/ directory, ready for deployment to any static hosting service.

Hosting Options

▲ Vercel - Zero-config deployment
🔷 Netlify - Simple drag & drop
☁️ Google Cloud Run - Containerized deployment
🅰️ AWS Amplify - Full-stack hosting

Note: HTTPS is required for microphone access in production environments.

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Fork the repository
Create your feature branch (git checkout -b feature/AmazingFeature)
Commit your changes (git commit -m 'Add some AmazingFeature')
Push to the branch (git push origin feature/AmazingFeature)
Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Google Gemini - Powering the AI conversations
React - UI framework
Vite - Lightning fast build tool
Tailwind CSS - Utility-first CSS framework

Built with Google Gemini by Anthony M

