A voice AI agent built with Pipecat and Plivo that handles phone calls using speech-to-text, an LLM, and text-to-speech.
- Real-time phone call handling via Plivo
- Speech-to-text using Deepgram
- AI responses using OpenAI GPT-4o-mini
- Text-to-speech using OpenAI TTS
- WebSocket streaming for low latency
- Python 3.10 or higher (3.12 recommended)
- Plivo account with phone number
- OpenAI API key
- Deepgram API key
- ngrok for local development
- Clone the repository:
    git clone https://github.com/plivo/python-agents-examples.git
    cd python-agents-examples/pipecat-plivo
- Create virtual environment:
    uv venv
- Install dependencies:
    uv pip install -r requirements.txt
- Create .env file:
    cp .env.example .env

Edit .env with your actual API keys:
PLIVO_AUTH_ID=your_plivo_auth_id
PLIVO_AUTH_TOKEN=your_plivo_auth_token
OPENAI_API_KEY=your_openai_api_key
PLIVO_PHONE_NUMBER=+1234567890
DEEPGRAM_API_KEY=your_deepgram_api_key
NGROK_URL=https://your-ngrok-url.ngrok-free.app
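
A missing or blank key is an easy mistake to make at this step, so a quick check before starting the agent can help. The following is a minimal sketch, not part of the repository: `missing_env_vars` is a hypothetical helper, and it assumes the variables listed above have already been loaded into the process environment (it does not parse the .env file itself).

```python
import os

# Variable names copied from the .env listing above.
REQUIRED_VARS = (
    "PLIVO_AUTH_ID",
    "PLIVO_AUTH_TOKEN",
    "OPENAI_API_KEY",
    "PLIVO_PHONE_NUMBER",
    "DEEPGRAM_API_KEY",
    "NGROK_URL",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or blank.

    Hypothetical helper for illustration; pass a dict to check something
    other than the current process environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


# Report what is missing without stopping the program.
missing = missing_env_vars()
print("Missing:", ", ".join(missing) if missing else "none")
```

Passing a plain dictionary, for example `missing_env_vars({"NGROK_URL": ""})`, makes the check easy to exercise without touching the real environment.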
- Start the voice agent:
    python voice_agent.py
- In another terminal, start ngrok:
    ngrok http 8080
- Configure Plivo webhook with your ngrok URL:
  - Answer URL: https://your-ngrok-url.ngrok-free.app/answer
  - Method: POST
- Call your Plivo number to test!
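
When a call arrives, Plivo sends a POST request to the Answer URL and expects Plivo XML in return. The contents of voice_agent.py are not reproduced here, so the following is only a sketch of what such a response might look like, assuming the agent accepts the audio stream on a WebSocket route. The `/ws` path and both function names are hypothetical; the `Stream` element and its `bidirectional`, `keepCallAlive`, and `contentType` attributes are part of Plivo XML.

```python
from urllib.parse import urlsplit
from xml.etree import ElementTree as ET


def stream_url(ngrok_url, ws_path="/ws"):
    """Turn the public tunnel URL into a WebSocket URL.

    The "/ws" default is an assumption for illustration, not a path
    taken from voice_agent.py.
    """
    parts = urlsplit(ngrok_url)
    scheme = "wss" if parts.scheme == "https" else "ws"
    return f"{scheme}://{parts.netloc}{ws_path}"


def answer_xml(ngrok_url):
    """Build a Plivo XML response that opens a bidirectional audio stream."""
    response = ET.Element("Response")
    stream = ET.SubElement(
        response,
        "Stream",
        {
            "bidirectional": "true",   # audio flows in both directions
            "keepCallAlive": "true",   # keep the call up while streaming
            "contentType": "audio/x-mulaw;rate=8000",
        },
    )
    stream.text = stream_url(ngrok_url)
    return ET.tostring(response, encoding="unicode")


print(answer_xml("https://your-ngrok-url.ngrok-free.app"))
```

Building the XML with `xml.etree` rather than string formatting keeps the output well-formed even if the URL contains characters that need escaping.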
    pipecat-plivo/
    ├── voice_agent.py     # Main application
    ├── .env               # Environment variables (not in git)
    ├── .env.example       # Example env file
    ├── .gitignore         # Git ignore rules
    ├── README.md          # This file
    └── requirements.txt   # Python dependencies
See .env.example for required environment variables.
MIT
Pull requests are welcome!