An AI-powered agent that automatically completes Duolingo lessons using GPT-4o Vision and OpenAI Whisper. The agent takes screenshots like a human, understands the exercise, and clicks/types the answer automatically — no HTML parsing needed.
| Script | Course | UI Language |
|---|---|---|
duolingo_cn_for_en.py |
Chinese (Mandarin) for English speakers | English |
duolingo_en_for_vi.py |
English for Vietnamese speakers | Vietnamese |
- Fully automatic - Detects exercise type, answers questions, starts new lessons, and loops continuously
- Multiple exercise types - Image choice, multiple choice, word bank, typing, matching pairs, listening, tap pairs, tracing (skip)
- Chinese character support - Handles mixed hanzi + pinyin text (e.g. "汤tāng") with smart extraction (CN script)
- Vietnamese UI support - Handles Vietnamese interface strings and additional exercise types (VI script)
- Answer caching - Learns correct answers from Duolingo feedback to improve accuracy (VI script)
- Listening exercises - Captures system audio via WASAPI loopback and transcribes with OpenAI Whisper
- Audio exercises - Audio matching, audio fill-in-the-blank, and listen-and-type (VI script)
- Human-like behavior - Random delays between actions, deliberate wrong answers (0-2 per lesson) to avoid detection
- Heart/lives management - Monitors hearts and auto-switches to practice mode when running low
- XP tracking - Reports XP gained before and after each session
- Session persistence - Saves login session to avoid re-authentication
- GitHub Actions support - Run daily practice automatically via CI/CD
- Graceful shutdown - Press Ctrl+C to stop and see XP summary
- Logs in to Duolingo (or loads a saved session)
- Checks hearts — if low, switches to free practice mode
- Starts a lesson automatically
- Enters a fully automatic loop:
- Captures a screenshot of the current screen
- Sends it to GPT-4o Vision, which returns structured JSON with exercise type, correct answer, and actions
- Executes the actions (clicking words, typing answers, pressing keyboard shortcuts)
- Clicks Check/Continue to proceed
- After each lesson, starts the next one until
MAX_LESSONSis reached
| Type | How it answers |
|---|---|
| Image choice | Presses the number key of the correct image (1, 2, or 3) |
| Multiple choice | Clicks the correct option or presses number shortcut |
| Word bank | Clicks words in the correct order (handles hanzi/pinyin tokens) |
| Typing | Types the answer in the text field |
| Matching pairs | Uses keyboard shortcuts (1-5 left, 6-0 right) |
| Listening | Records system audio via Whisper (zh) then clicks/types answer |
| Tap pairs | Clicks matching pairs sequentially |
| Tracing | Skipped automatically |
All of the above, plus:
| Type | How it answers |
|---|---|
| Checkbox | Reading comprehension — checks all correct answers |
| Audio matching | Matches English audio to Vietnamese text (or vice versa) |
| Audio fill-in-the-blank | Listens to audio, fills in the missing word |
| Listen and type | Listens to audio and types the full sentence |
| Speaking | Skipped automatically (no microphone) |
- Python 3.8+
- An OpenAI API key with GPT-4o access
- A Duolingo account
- Clone the repository:
git clone https://github.com/your-username/duolingo-ai-agent.git
cd duolingo-ai-agent- Install dependencies:
pip install -r requirements.txt- Install the Playwright browser:
playwright install chromium- Create a
.envfile from the example:
cp .env.example .env- Fill in your credentials in
.env:
OPENAI_API_KEY=your_openai_key
# Chinese for English speakers
DUO_EMAIL=your_email
DUO_PASSWORD=your_password
DUO_PROFILE_URL=your_duolingo_username
# English for Vietnamese speakers
VI_DUO_EMAIL=your_email
VI_DUO_PASSWORD=your_password
VI_DUO_PROFILE_URL=your_duolingo_username
python duolingo_cn_for_en.pypython duolingo_en_for_vi.pyThe agent will log in, check XP, and start completing lessons automatically. Press Ctrl+C at any time to stop and see your XP summary.
| Variable | Script | Description | Default |
|---|---|---|---|
OPENAI_API_KEY |
Both | OpenAI API key | Required |
DUO_EMAIL |
CN | Duolingo email (Chinese course) | Required |
DUO_PASSWORD |
CN | Duolingo password (Chinese course) | Required |
DUO_PROFILE_URL |
CN | Duolingo username for XP tracking | Optional |
VI_DUO_EMAIL |
VI | Duolingo email (English course) | Required |
VI_DUO_PASSWORD |
VI | Duolingo password (English course) | Required |
VI_DUO_PROFILE_URL |
VI | Duolingo username for XP tracking | Optional |
DUO_JWT |
Both | Pre-authenticated JWT token (recommended for CI) | Optional |
MAX_LESSONS |
Both | Number of lessons to complete (0 = unlimited) | 0 |
HEADLESS |
Both | Run browser without UI (true/false) |
false |
Duolingo's login API may block automated requests (reCAPTCHA, rate limits). The most reliable way to authenticate in CI is using a JWT token from your browser session.
- Log in to Duolingo in your browser
- Open the browser DevTools console (F12 > Console)
- Run this script to copy your JWT token:
document.cookie.split(';').find(cookie => cookie.includes('jwt_token')).split('=')[1]- Copy the output — that's your JWT token
Note: JWT tokens expire periodically. If the agent stops working, repeat the steps above to get a fresh token.
- Go to your repo Settings > Secrets and variables > Actions
- Add these secrets:
DUO_EMAIL— your Duolingo emailDUO_PASSWORD— your Duolingo passwordDUO_JWT— your JWT token (recommended, see Getting Your JWT Token)OPENAI_API_KEY— your OpenAI API keyDUO_PROFILE_URL— your Duolingo username (optional, for XP tracking)
- The workflow runs daily at 7:00 AM UTC, or trigger manually from the Actions tab
- Each run executes 2-3 rounds with 5-20 lessons each, with random sleep intervals between rounds
Note: Listening exercises are automatically skipped in CI (no audio device). The agent will click "CAN'T LISTEN NOW" and continue.
- Playwright - Browser automation
- OpenAI GPT-4o - Vision model for screenshot analysis
- OpenAI Whisper - Audio transcription for listening exercises
- sounddevice + soundfile - System audio capture via WASAPI loopback
- python-dotenv - Environment variable management
duolingo-ai-agent/
├── .github/
│ └── workflows/
│ └── duolingo.yml # GitHub Actions daily practice
├── duolingo_cn_for_en.py # Chinese for English speakers
├── duolingo_en_for_vi.py # English for Vietnamese speakers
├── requirements.txt # Python dependencies
├── .env.example # Environment variable template
└── README.md