A real-time AI tutoring agent powered by the Gemini Live API, featuring a lip-synced 3D avatar, bidirectional voice interruption, and a live content panel that renders code, math, and diagrams on the fly.
Built for the Gemini Live Agent Challenge — Live Agents category.
Browser (index.html)
    │
    ├── HTTPS ──────────────────► Firebase Hosting
    │
    └── WSS (WebSocket) ────────► Cloud Run (live_service.py)
                                      │
                                      ├── Gemini Live API (bidi streaming)
                                      └── Secret Manager (GEMINI_API_KEY)
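The browser reaches the backend over WSS, so a Cloud Run HTTPS service URL has to be mapped to its WebSocket equivalent somewhere in the client or deploy tooling. A minimal sketch of that mapping, assuming a helper named `to_websocket_url` (hypothetical, not part of this repo):

```python
from urllib.parse import urlsplit, urlunsplit


def to_websocket_url(service_url: str) -> str:
    """Map an http(s) service URL to the matching ws(s) URL."""
    parts = urlsplit(service_url)
    # https -> wss (TLS), http -> ws (plain, e.g. local development)
    scheme = {"https": "wss", "http": "ws"}.get(parts.scheme)
    if scheme is None:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    # Drop any fragment; keep host, path, and query unchanged.
    return urlunsplit((scheme, parts.netloc, parts.path, parts.query, ""))
```

For local development this turns `http://localhost:8765` into `ws://localhost:8765`, which matches the address the backend prints at startup.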
Tech Stack:
- Frontend: Vanilla JS, Web Audio API (AudioWorklet), TalkingHead.js, Three.js
- Backend: Python 3.11, google-genai SDK, WebSockets
- AI: Gemini 2.5 Flash Native Audio (Live API)
- Cloud: Google Cloud Run, Secret Manager, Firebase Hosting
- Avatar: Ready Player Me + TalkingHead.js lipsync
- Real-time voice conversation — talk naturally, no typing required
- Bidirectional interruption — interrupt the tutor mid-sentence naturally
- Lip-synced 3D avatar — viseme-based lipsync driven by streamed audio
- Live content panel — code blocks, math (KaTeX), diagrams (Mermaid.js) rendered in real-time via Gemini tool calls
- Gapless audio playback — scheduled AudioBufferSourceNodes for smooth streaming
- Camera/screen mode — optionally stream webcam or screen to Gemini
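The gapless playback feature relies on giving each audio chunk an explicit start time instead of playing it on arrival. The sketch below models that scheduling logic in Python for illustration only; the real frontend does this in JavaScript with `AudioBufferSourceNode.start(when)`, and the class name and `lead_time` value here are assumptions.

```python
class GaplessScheduler:
    """Tracks when the next audio chunk should start playing."""

    def __init__(self, lead_time: float = 0.25):
        # Small safety margin so a chunk is never scheduled in the past.
        self.lead_time = lead_time
        # Time at which the previously scheduled chunk finishes.
        self.next_start = 0.0

    def schedule(self, now: float, duration: float) -> float:
        """Return the start time for a chunk of `duration` seconds."""
        # Normally start exactly where the last chunk ends. After an
        # underrun (queue ran dry), restart slightly in the future.
        start = max(self.next_start, now + self.lead_time)
        self.next_start = start + duration
        return start
```

Because each chunk starts exactly where the previous one ends, network jitter does not produce audible gaps as long as chunks arrive before their scheduled start.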
eduverse/
├── index.html              # Frontend: avatar, audio, WebSocket client
├── edu.glb                 # 3D avatar model (Ready Player Me)
├── .gitignore
├── README.md
└── backend/
    ├── live_service.py     # Main backend: Gemini Live API + WebSocket server
    ├── image_input.py      # Image input helper
    ├── gemini_service.py   # Gemini service utilities
    ├── main.py             # Entry point
    ├── requirements.txt    # Python dependencies
    ├── Dockerfile          # Cloud Run container
    └── .env.example        # Environment variable template
- Python 3.11+
- Node.js 20+ (for Firebase CLI, optional for local dev)
- A Gemini API key
- A webcam (optional; use `--mode none` to skip)
git clone https://github.com/Kareem-007/Eduverse.git
cd Eduverse
cd backend
python3 -m venv venv
source venv/bin/activate # Linux/Mac
# venv\Scripts\activate # Windows
pip install -r requirements.txt
cp .env.example .env
Edit .env and add your Gemini API key:
GEMINI_API_KEY=your_gemini_api_key_here
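A common mistake is starting the server with the placeholder key still in `.env`. The sketch below shows one way to catch that early using only the standard library. It is an assumption for illustration, not the loader the backend actually uses (that may rely on a package such as python-dotenv), and `parse_env` and `require_api_key` are hypothetical names.

```python
def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Tolerate optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_api_key(values: dict) -> str:
    """Return the API key, or raise if it is missing or a placeholder."""
    key = values.get("GEMINI_API_KEY", "")
    if not key or key == "your_gemini_api_key_here":
        raise RuntimeError("GEMINI_API_KEY is missing or still the placeholder")
    return key
```

Typical use would be `require_api_key(parse_env(open(".env").read()))` before opening any connection.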
# No camera (recommended for testing)
python live_service.py --mode none
Under development (fixes in progress):
# With webcam
python live_service.py --mode camera
# With screen capture
python live_service.py --mode screen
You should see:
[WS] Server started on ws://localhost:8765
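The three `--mode` values used above (`none`, `camera`, `screen`) map naturally onto an `argparse` choice list. This is a minimal sketch of how such a flag could be declared; the default value, help text, and `build_parser` name are assumptions, and `live_service.py` may parse its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI parser that accepts the three documented modes."""
    parser = argparse.ArgumentParser(description="Eduverse backend (sketch)")
    parser.add_argument(
        "--mode",
        choices=["none", "camera", "screen"],
        default="none",  # assumed default: audio only, no video input
        help="video source to stream to Gemini alongside audio",
    )
    return parser
```

With `choices` set, an unsupported value such as `--mode video` makes `argparse` exit with a usage error instead of starting the server in an undefined state.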
Open index.html directly in your browser:
# Linux
xdg-open ../index.html
# Mac
open ../index.html
# Or just drag index.html into Chrome/Firefox
Note: Use a browser that supports the Web Audio API and `getUserMedia` (Chrome recommended).
- Wait for the 3D avatar to load
- Click Start Session
- Wait for status to show CONNECTED
- Click Mic On to start talking
- Ask Eduverse anything — math, code, science, history
To fully test all features:
| Feature | How to test |
|---|---|
| Voice conversation | Click Mic On and ask a question |
| Interruption | Start talking while the avatar is speaking |
| Code rendering | Ask "show me a Python function for fibonacci" |
| Math rendering | Ask "explain the euler identity" |
| Lipsync | Watch the avatar's mouth sync to the audio |
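When testing interruption, the avatar should stop speaking almost immediately, which means audio that was already buffered has to be discarded. The sketch below shows one common way to do that with a generation counter. It is written in Python for illustration, the class and method names are hypothetical, and it is not confirmed to match how this project handles interruption.

```python
from collections import deque


class PlaybackQueue:
    """Buffers audio chunks and drops them when the user interrupts."""

    def __init__(self):
        self._pending = deque()
        # Incremented on every interruption; identifies the current turn.
        self.generation = 0

    def enqueue(self, chunk: bytes, generation: int) -> bool:
        # Reject late chunks that belong to a turn already interrupted.
        if generation != self.generation:
            return False
        self._pending.append(chunk)
        return True

    def interrupt(self) -> int:
        """Discard buffered audio; return how many chunks were dropped."""
        dropped = len(self._pending)
        self._pending.clear()
        self.generation += 1
        return dropped

    def next_chunk(self):
        """Return the next chunk to play, or None if the queue is empty."""
        return self._pending.popleft() if self._pending else None
```

The generation check matters because chunks from the interrupted response can still be in flight on the WebSocket after the queue has been cleared.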
The backend is configured for Google Cloud Run deployment.
- Google Cloud project with billing enabled
- `gcloud` CLI installed and authenticated
- Docker installed
# Set your project
gcloud config set project YOUR_PROJECT_ID
# Enable APIs
gcloud services enable run.googleapis.com artifactregistry.googleapis.com secretmanager.googleapis.com
# Store API key
echo -n "YOUR_GEMINI_API_KEY" | gcloud secrets create GEMINI_API_KEY --data-file=- --replication-policy=automatic
# Build and push image
gcloud artifacts repositories create eduverse-repo --repository-format=docker --location=us-central1
docker build -t us-central1-docker.pkg.dev/YOUR_PROJECT_ID/eduverse-repo/backend:latest ./backend
docker push us-central1-docker.pkg.dev/YOUR_PROJECT_ID/eduverse-repo/backend:latest
# Deploy to Cloud Run
gcloud run deploy eduverse-backend \
--image=us-central1-docker.pkg.dev/YOUR_PROJECT_ID/eduverse-repo/backend:latest \
--platform=managed \
--region=us-central1 \
--port=8765 \
--timeout=3600 \
--concurrency=1 \
--set-secrets="GEMINI_API_KEY=GEMINI_API_KEY:latest" \
--allow-unauthenticated \
--session-affinity
- Gemini Live API: `gemini-2.5-flash-native-audio-preview` via the `google-genai` SDK
- Cloud Run: backend WebSocket server hosting
- Secret Manager: secure API key storage
- Firebase Hosting: frontend static file serving
- Artifact Registry: Docker image storage
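Cloud Run treats a WebSocket as a long-running request, so with `--timeout=3600` a session is closed after at most one hour and the client needs to reconnect. A minimal sketch of an exponential backoff schedule for those reconnect attempts follows; the function name, base delay, and cap are assumptions, not values taken from this project.

```python
def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> list:
    """Return the wait time in seconds before each reconnect attempt.

    The delay doubles after every failed attempt and is limited to `cap`.
    """
    return [min(cap, base * 2 ** i) for i in range(attempts)]
```

In a real client, adding a small random jitter to each delay helps avoid many browsers reconnecting at the same instant after a redeploy.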
| Variable | Description |
|---|---|
| `GEMINI_API_KEY` | Your Gemini API key from Google AI Studio |
MIT License — see LICENSE for details.
Built for the Gemini Live Agent Challenge 2026 — #GeminiLiveAgentChallenge