Real-time meeting transcription with AI-powered insights. Captures live audio from your microphone, transcribes it in real time using Google Cloud Speech-to-Text, and provides AI-generated summaries and answers using Google Gemini.
- Real-Time Transcription: Live speech-to-text as you speak, with interim results and automatic punctuation
- AI Meeting Assistant: Ask questions about the meeting or get automatic summaries powered by Gemini
- Streaming Responses: AI answers stream in real time for immediate feedback
- WebSocket Architecture: Low-latency audio streaming for responsive transcription
- Audio Level Monitoring: Visual feedback showing microphone input levels
- Microphone Testing: Built-in mic test to verify audio setup before recording
- Transcript Download: Export your meeting transcripts as text files
- Dark Theme: Easy-on-the-eyes dark interface designed for extended use
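As an illustration of the audio level monitoring feature, the sketch below turns a buffer of audio samples into a 0-100 meter value. How the app actually computes its level is not stated here; the root-mean-square (RMS) approach and the function name `rmsLevel` are assumptions made for this example.

```javascript
// Sketch: convert a buffer of audio samples in [-1, 1] to a 0-100 meter value.
// RMS is an assumed technique, not necessarily what the app uses.
function rmsLevel(samples) {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (const sample of samples) {
    sumSquares += sample * sample;
  }
  const rms = Math.sqrt(sumSquares / samples.length);
  // Clamp so that unexpected out-of-range samples cannot exceed the meter.
  return Math.min(100, Math.round(rms * 100));
}
```

In a browser, a buffer like this could come from a Web Audio `AnalyserNode` via `getFloatTimeDomainData`, with the result driving the width of a level bar.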
| Layer | Technology |
|---|---|
| Frontend | Next.js 16, React 19, TypeScript, Tailwind CSS 4 |
| Backend | Node.js custom server with WebSocket support |
| Speech Recognition | Google Cloud Speech-to-Text (streaming API) |
| AI Assistant | Google Gemini via LangChain |
| Package Manager | pnpm (monorepo workspace) |
- Node.js 20+ (Node.js 25+ recommended)
- pnpm 10+ - Fast, disk space efficient package manager
- Google Cloud Account - For Speech-to-Text API
- Google AI API Key - For Gemini AI
# Clone the repository
git clone https://github.com/WSzP/live-meeting-helper.git
cd live-meeting-helper
# Install dependencies
pnpm install
Create a .env.local file in the project root:
# Google Cloud Speech-to-Text credentials
GOOGLE_APPLICATION_CREDENTIALS=.gcp/your-service-account.json
# Google AI (Gemini) API key
GOOGLE_API_KEY=your-google-ai-api-key
# Optional: Custom port (default: 3000)
PORT=3000
- Create a Google Cloud project
- Enable the Speech-to-Text API
- Create a service account with Speech-to-Text permissions
- Download the JSON key file and place it in .gcp/ (do NOT commit it)
See docs/GOOGLE_CLOUD_SETUP.md for detailed instructions.
- Go to Google AI Studio
- Create an API key
- Add it to .env.local as GOOGLE_API_KEY
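Both credentials must be present before the server can reach Google's APIs, so a startup check can catch a missing value early. The sketch below is not taken from the project's code: the function name `checkEnv` is hypothetical, and it assumes the variables from .env.local have already been loaded into the environment object passed in.

```javascript
// Hypothetical startup check for the variables described above.
// Not part of the project; a sketch under the stated assumptions.
const REQUIRED_VARS = ["GOOGLE_APPLICATION_CREDENTIALS", "GOOGLE_API_KEY"];

function checkEnv(env) {
  // Treat unset and blank values the same way.
  const missing = REQUIRED_VARS.filter(
    (name) => typeof env[name] !== "string" || env[name].trim() === ""
  );
  // PORT is optional and falls back to 3000, matching the example file.
  const port = env.PORT === undefined || env.PORT === "" ? 3000 : Number(env.PORT);
  const portValid = Number.isInteger(port) && port > 0 && port < 65536;
  return { ok: missing.length === 0 && portValid, missing, port, portValid };
}

const envCheck = checkEnv(process.env);
if (!envCheck.ok) {
  console.warn("Configuration problem:", envCheck);
}
```

Returning a result object instead of throwing lets the caller decide whether a missing key should stop the server or only disable the affected feature.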
# Start development server
pnpm dev
Open http://localhost:3000 in your browser.
- Click Start Recording to begin
- Allow microphone access when prompted
- Speak clearly into your microphone
- Watch real-time transcription appear
- Click Ask AI to get insights about the meeting
- Click Stop Recording when finished
- Download your transcript
live-meeting-helper/
├── frontend/ # Next.js application
│ ├── src/
│ │ ├── app/ # App Router pages and layouts
│ │ ├── components/ # React components
│ │ └── hooks/ # Custom React hooks
│ ├── server.js # Custom Node.js server with WebSocket
│ ├── next.config.ts # Next.js configuration
│ ├── eslint.config.mjs # ESLint flat config
│ └── postcss.config.js # PostCSS/Tailwind configuration
├── docs/ # Documentation
├── .gcp/ # Google Cloud credentials (gitignored)
├── package.json # Root package.json (workspace)
└── pnpm-workspace.yaml # pnpm workspace configuration
- Audio Capture: The browser captures microphone audio using the MediaRecorder API (WebM/Opus format)
- WebSocket Streaming: Audio chunks are streamed to the server via WebSocket every 250 ms
- Speech Recognition: The server pipes audio to the Google Cloud Speech-to-Text streaming API
- Real-Time Results: Interim and final transcription results are sent back to the client
- AI Processing: When requested, the transcript is sent to Gemini (gemini-3-flash-preview) via LangChain, with responses streaming back through the same WebSocket connection
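The last two steps above mean the client receives several kinds of messages over one connection: interim results that keep changing, final results that are settled, and streamed AI output. A minimal sketch of how a client might fold these into display state follows. The message shapes (`type`, `text`, `isFinal`) and the type names `transcript` and `ai_chunk` are assumptions for illustration, not the app's actual protocol.

```javascript
// Sketch of client-side state handling for transcript and AI messages.
// Message shapes are hypothetical; the real protocol may differ.
function createTranscriptState() {
  return { finalText: "", interimText: "", aiAnswer: "" };
}

function applyMessage(state, message) {
  switch (message.type) {
    case "transcript": {
      if (message.isFinal) {
        // A final result is appended and clears any pending interim text.
        const separator = state.finalText ? " " : "";
        return {
          ...state,
          finalText: state.finalText + separator + message.text.trim(),
          interimText: "",
        };
      }
      // Interim results overwrite each other until a final one arrives.
      return { ...state, interimText: message.text };
    }
    case "ai_chunk":
      // Streamed AI output is concatenated in arrival order.
      return { ...state, aiAnswer: state.aiAnswer + message.text };
    default:
      // Unknown message types leave the state unchanged.
      return state;
  }
}

// Text to render: settled transcript followed by the current interim guess.
function displayText(state) {
  return [state.finalText, state.interimText].filter(Boolean).join(" ");
}
```

Keeping interim text separate from final text avoids duplicated words when the recognizer revises its guess, and a pure function like `applyMessage` fits naturally into a React reducer.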
┌─────────────┐     WebSocket      ┌─────────────────┐
│   Browser   │ ◄────────────────► │  Node.js Server │
│   (React)   │    Audio + Text    │                 │
└─────────────┘                    └────────┬────────┘
                                            │
                         ┌──────────────────┼──────────────────┐
                         │                  │                  │
                         ▼                  ▼                  ▼
                 ┌───────────────┐  ┌───────────────┐  ┌──────────────┐
                 │ Google Cloud  │  │   Google AI   │  │   Next.js    │
                 │ Speech-to-Text│  │   (Gemini)    │  │  (SSR/CSR)   │
                 └───────────────┘  └───────────────┘  └──────────────┘
# Install dependencies
pnpm install
# Start development server
pnpm dev
# Build for production
pnpm build
# Start production server
pnpm start
# Run linting
pnpm lint
- 5-minute streaming limit: Google Cloud Speech-to-Text has a ~305 second limit per stream. The app automatically restarts streams to handle longer meetings.
- Browser support: Requires a modern browser with MediaRecorder API support (Chrome, Firefox, Edge)
- English only: Currently configured for en-US. Can be modified in server.js
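To illustrate the stream restart mentioned in the first limitation, here is a minimal sketch of time-based stream rotation. It is not the app's implementation: `createStream` is a hypothetical factory standing in for whatever opens a Speech-to-Text streaming request, and the 290-second threshold is an assumed safety margin below the ~305 second limit.

```javascript
// Sketch of time-based stream rotation under the assumptions stated above.
const MAX_STREAM_MS = 290 * 1000; // assumed margin below the ~305 s limit

class RotatingStream {
  // `createStream` must return an object with write(chunk) and end().
  // `now` is injectable so the timing logic can be tested without waiting.
  constructor(createStream, now = Date.now) {
    this.createStream = createStream;
    this.now = now;
    this.stream = null;
    this.startedAt = 0;
    this.restarts = 0;
  }

  write(chunk) {
    const expired =
      this.stream !== null && this.now() - this.startedAt >= MAX_STREAM_MS;
    if (expired) {
      // Close the old stream before it reaches the service limit.
      this.stream.end();
      this.stream = null;
      this.restarts += 1;
    }
    if (this.stream === null) {
      this.stream = this.createStream();
      this.startedAt = this.now();
    }
    this.stream.write(chunk);
  }

  close() {
    if (this.stream !== null) {
      this.stream.end();
      this.stream = null;
    }
  }
}
```

With audio arriving every 250 ms, a stream under these assumptions would carry roughly 1,160 chunks before being replaced. A real implementation would likely also need to handle speech that spans the restart boundary, which this sketch does not attempt.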
| Name | Hex | Usage |
|---|---|---|
| Night Navy | #040932 | Background |
| Cloud Lilac | #F3F1FA | Primary text color over dark backgrounds |
| Pulse Cyan | #0BB2F2 | "Live" feeling - realtime indicators, highlights, badges, waveform accents |
In the logo text: Live and Helper use Cloud Lilac, Meeting uses Pulse Cyan.
| Asset | Path | Usage |
|---|---|---|
| Logo (PNG) | assets/lmh-logo.png | High quality |
| Logo (WebP) | assets/lmh-logo.webp | Web optimized |
| Open Graph | assets/lmh-open-graph.png | Social media previews |
Apache License 2.0
Created by Peter W. Szabo, 2026.

