EPUB Audiobook Reader

Turn any EPUB into an audiobook with AI voice cloning. Powered by Faster Qwen3-TTS.

Features

Voice Cloning — clone any voice from a short audio sample
Custom Voices — use built-in Qwen3-TTS speaker voices
Real-time Streaming — audio plays as it's generated, no waiting
Chapter Navigation — browse and select chapters from the sidebar
EPUB Library — books are saved locally for quick access
Full Audiobook Export — generate and download a complete WAV file
Voice Manager — record reference audio directly in the browser or upload files
Dark / Light Mode — with system preference detection
Bilingual — automatic Chinese / English language detection and chunking
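
The bilingual detection and chunking feature can be pictured roughly as follows. This is a minimal sketch, not the project's actual code: the function names `detect_language` and `chunk_text`, the 30% CJK threshold, and the 200-character chunk limit are all assumptions chosen for illustration.

```python
import re

# CJK Unified Ideographs (basic block only); a simplification for this sketch.
_CJK = re.compile(r"[\u4e00-\u9fff]")
# Split right after Chinese sentence punctuation, or after English
# sentence punctuation that is followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[。！？])|(?<=[.!?])\s+")


def detect_language(text: str, threshold: float = 0.3) -> str:
    """Return "zh" if enough non-space characters are CJK, else "en"."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return "en"
    cjk = sum(1 for c in chars if _CJK.match(c))
    return "zh" if cjk / len(chars) >= threshold else "en"


def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Group whole sentences into chunks of at most max_chars where possible."""
    sentences = [s for s in _SENTENCE_END.split(text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Chinese sentences are joined without a space, English with one.
        joiner = "" if detect_language(sentence) == "zh" else " "
        candidate = f"{current}{joiner}{sentence}" if current else sentence
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

A single sentence longer than `max_chars` is kept whole rather than cut mid-sentence, which favors natural prosody over a strict size limit.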

Requirements

Python 3.10+
NVIDIA GPU with CUDA support
~4 GB VRAM for the 0.6B model, ~8 GB for the 1.7B model

Installation

git clone https://github.com/williamcotton/epub-audiobook-reader.git
cd epub-audiobook-reader

# Install PyTorch for your CUDA version first (see https://pytorch.org/get-started/locally/)
# Example for CUDA 12.8 (RTX 50-series):
uv pip install torch --index-url https://download.pytorch.org/whl/cu128

# Then install the project
uv sync

Quick Start

# Start with the default 1.7B model (auto-downloads from HuggingFace)
epub-audiobook-reader

# Use the smaller 0.6B model
epub-audiobook-reader --model Qwen/Qwen3-TTS-12Hz-0.6B-Base

# Start without preloading a model (select in the UI)
epub-audiobook-reader --no-preload

# Enable HTTPS for microphone recording support
epub-audiobook-reader --ssl

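Then open http://localhost:7861 in your browser. If you started the server with `--ssl`, use https://localhost:7861 instead and accept the browser warning about the self-signed certificate.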

CLI Options

| Option | Default | Description |
| --- | --- | --- |
| `--model` | `Qwen/Qwen3-TTS-12Hz-1.7B-Base` | Model to preload at startup |
| `--port` | `7861` | Server port |
| `--host` | `0.0.0.0` | Server host |
| `--no-preload` | off | Skip model loading at startup |
| `--ssl` | off | Enable HTTPS with auto-generated self-signed certificate |
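
For readers who want to script around the CLI, the options above map naturally onto an `argparse` parser. The sketch below only mirrors the documented flags and defaults; the function name `build_parser` is hypothetical, and the project's real entry point may be organized differently.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented CLI options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="epub-audiobook-reader")
    parser.add_argument(
        "--model",
        default="Qwen/Qwen3-TTS-12Hz-1.7B-Base",
        help="Model to preload at startup",
    )
    parser.add_argument(
        "--port",
        type=int,
        # PORT supplies the default; an explicit --port overrides it.
        default=int(os.environ.get("PORT", "7861")),
        help="Server port",
    )
    parser.add_argument("--host", default="0.0.0.0", help="Server host")
    parser.add_argument(
        "--no-preload",
        action="store_true",
        help="Skip model loading at startup",
    )
    parser.add_argument(
        "--ssl",
        action="store_true",
        help="Enable HTTPS with a self-signed certificate",
    )
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Reading `PORT` as the default for `--port` is one simple way to get the documented precedence, where the flag wins over the environment variable.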

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| `QWEN3_TTS_ROOT` | (none) | Local directory containing downloaded models |
| `PORT` | `7861` | Server port (overridden by `--port`) |
| `MODEL_CACHE_SIZE` | `2` | Maximum number of models to keep loaded |
| `ACTIVE_MODELS` | all | Comma-separated list of models to show in the UI |
| `ASSET_DIR` | `/tmp/faster-qwen3-tts-assets` | Directory for downloaded reference audio assets |
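
As a rough illustration of how these variables and their defaults fit together, the sketch below reads them into a single settings object. The `Settings` class and `load_settings` function are hypothetical names introduced here; only the variable names and defaults come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    models_root: Optional[str]          # QWEN3_TTS_ROOT, None when unset
    port: int                           # PORT
    model_cache_size: int               # MODEL_CACHE_SIZE
    active_models: Optional[list[str]]  # ACTIVE_MODELS, None means "show all"
    asset_dir: str                      # ASSET_DIR


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented environment variables, applying documented defaults."""
    raw_models = env.get("ACTIVE_MODELS", "")
    models = [m.strip() for m in raw_models.split(",") if m.strip()]
    return Settings(
        models_root=env.get("QWEN3_TTS_ROOT") or None,
        port=int(env.get("PORT", "7861")),
        model_cache_size=int(env.get("MODEL_CACHE_SIZE", "2")),
        active_models=models or None,
        asset_dir=env.get("ASSET_DIR", "/tmp/faster-qwen3-tts-assets"),
    )
```

Passing the environment as a mapping makes the function easy to exercise with a plain dictionary, for example `load_settings({"PORT": "8000"})`.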

Available Models

| Model | Type | Size | Description |
| --- | --- | --- | --- |
| `Qwen/Qwen3-TTS-12Hz-0.6B-Base` | Voice Clone | ~0.6B | Smaller, faster model |
| `Qwen/Qwen3-TTS-12Hz-1.7B-Base` | Voice Clone | ~1.7B | Higher quality |
| `Qwen/Qwen3-TTS-12Hz-0.6B-CustomVoice` | Custom Voice | ~0.6B | Built-in speakers, smaller |
| `Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice` | Custom Voice | ~1.7B | Built-in speakers, higher quality |
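
With four models available and `MODEL_CACHE_SIZE` capping how many stay loaded, some eviction policy is needed when a further model is requested. The README does not say which policy is used; the sketch below assumes least-recently-used eviction, and the `ModelCache` class and its `loader` callback are hypothetical names for illustration.

```python
from collections import OrderedDict
from typing import Callable, Generic, TypeVar

M = TypeVar("M")


class ModelCache(Generic[M]):
    """Keep at most max_size models loaded, evicting the least recently used."""

    def __init__(self, loader: Callable[[str], M], max_size: int = 2) -> None:
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self._loader = loader
        self._max_size = max_size
        self._models: OrderedDict[str, M] = OrderedDict()

    def get(self, name: str) -> M:
        if name in self._models:
            # Mark as most recently used.
            self._models.move_to_end(name)
            return self._models[name]
        model = self._loader(name)
        self._models[name] = model
        while len(self._models) > self._max_size:
            # Drop the oldest entry; a real server would also free GPU memory here.
            self._models.popitem(last=False)
        return model

    def loaded(self) -> list[str]:
        """Names of cached models, least recently used first."""
        return list(self._models)
```

With the default size of 2, switching between a Base model and a CustomVoice model would not trigger a reload, while requesting a third model would evict whichever of the two was used longest ago.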

How It Works

Upload an EPUB — the book is parsed into chapters and text segments
Select a voice — choose a built-in preset, upload reference audio, or record your own
Play — click any chapter to start streaming audio in real-time
Export — optionally generate a full audiobook WAV file for offline listening
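
The first step, parsing an EPUB into chapters, can be sketched with the Python standard library alone. An EPUB is a ZIP archive whose `META-INF/container.xml` points to a package (OPF) file, and the package's spine lists the content documents in reading order. The code below is an illustrative sketch under those assumptions, not the project's parser; `read_chapters` is a hypothetical name, and it skips details such as URL-encoded hrefs and non-UTF-8 documents.

```python
import posixpath
import zipfile
from html.parser import HTMLParser
from xml.etree import ElementTree as ET

_NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
}
_SKIPPED_TAGS = ("head", "script", "style")


class _TextExtractor(HTMLParser):
    """Collect visible text, ignoring <head>, <script>, and <style> contents."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in _SKIPPED_TAGS:
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in _SKIPPED_TAGS and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())


def read_chapters(epub) -> list[str]:
    """Return the plain text of each spine item, in reading order."""
    with zipfile.ZipFile(epub) as zf:
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        opf_path = container.find(".//c:rootfile", _NS).attrib["full-path"]
        opf = ET.fromstring(zf.read(opf_path))
        # Map manifest ids to hrefs, which are relative to the OPF file.
        manifest = {
            item.attrib["id"]: item.attrib["href"]
            for item in opf.findall(".//opf:manifest/opf:item", _NS)
        }
        base = posixpath.dirname(opf_path)
        chapters = []
        for ref in opf.findall(".//opf:spine/opf:itemref", _NS):
            path = posixpath.normpath(posixpath.join(base, manifest[ref.attrib["idref"]]))
            extractor = _TextExtractor()
            extractor.feed(zf.read(path).decode("utf-8"))
            chapters.append(" ".join(extractor.parts))
        return chapters
```

Each returned string could then be split into smaller text segments before synthesis, so that audio for the first segment can start playing while later ones are still being generated.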

License

MIT
