
toodles

Telegram × Gemini CLI: streamed responses, voice, file & photo sharing, local transcription

Quick Start · Features · Config · Architecture


A Telegram bot written in Rust that wraps gemini-cli, letting you chat with Gemini AI directly from Telegram, with real-time streaming, voice transcription, photo & file analysis, and per-topic session isolation.

✨ Features

| Feature | Details |
| --- | --- |
| 💬 Real-time streaming | In-place draft updates while the model is generating, followed by a final formatted message |
| ⏳ Instant feedback | Immediate startup placeholder ("Подключаю Gemini-сессию…", Russian for "Connecting Gemini session…") on cold starts |
| 🛑 Stop generation | Inline "🛑 Stop" button to cancel generation mid-stream |
| Smart message splitting | Long responses auto-split into multiple Telegram messages at newline boundaries, with no truncation |
| ⚠️ Error feedback | Session startup and runtime errors are surfaced to the user (no silent failures) |
| 📷 Photo analysis | Send photos (including albums); they are batched by the aggregator and analyzed by Gemini Vision |
| 📄 Document handling | Send files (PDF, XLSX, etc.); they are downloaded and forwarded to gemini-cli for processing |
| 📎 File sharing | Gemini can send files back via the ATTACH_FILE: protocol |
| 🧩 Message aggregation | Sequential messages within 1.5s are batched into a single prompt; handles albums, forwarded batches, and split messages |
| 🔥 Warm session pool | Keeps prewarmed ACP sessions to reduce first-response latency (WARM_SESSION_POOL_SIZE) |
| ♻️ Session startup retries | Automatic retry with backoff when ACP initialization fails transiently |
| 🎙 Voice messages | Transcribed locally via Parakeet V3 or in the cloud via OpenAI Whisper |
| 🧠 Local transcription | Offline, no API keys: NVIDIA Parakeet ONNX (int8, ~478 MB) |
| 📌 Forum topics | Each Telegram topic gets an isolated gemini-cli session |
| 🏷️ Thread auto-title | First message sets the topic title; later updates use recent-context summaries |
| 🔄 Session management | /new starts fresh, /status shows active count |
| 🔒 Access control | Optional user allowlist via ALLOWED_USER_IDS |
| 🖥️ macOS background service | launchd targets keep the bot running 24/7 with auto-restart |
| 🧙 Setup wizard | Interactive --setup generates .env with guided prompts |
| 🎨 Customisable prompt | System prompt configurable via SYSTEM_PROMPT in .env |
| ✅ CI-gated | check + fmt + clippy + test on every push/PR |

🚀 Quick Start

Prerequisites

  • Rust ≥ 1.70: rustup.rs
  • gemini-cli: npm install -g @google/gemini-cli && gemini
  • Telegram bot token: @BotFather
  • ffmpeg: brew install ffmpeg (required for voice messages)
  • (Optional) OpenAI API key: for cloud Whisper fallback

Install & Run

git clone https://github.com/sleep3r/toodles
cd toodles

# Option A: Interactive setup wizard (recommended)
make setup

# Option B: Manual config
cp .env.example .env
$EDITOR .env

# Run
make run            # debug
make release        # optimized build
make run-release    # run optimized

# Optional: install as macOS launchd service (24/7)
make service-install

Run as macOS background service (launchd)

make service-install   # build release + install + start
make service-status    # check launchd state
make service-logs      # tail bot logs

service-install copies your project .env into ~/.config/toodles/service.env so launchd can read secrets consistently.

After code changes:

make service-update    # rebuild release + restart service

If you change .env, run make service-update to sync it into the service env file.

Stop / remove service:

make service-stop
make service-uninstall

Optional overrides (passed as Make variables):

make LAUNCHD_LABEL=com.alex.toodles service-install
make TOODLES_ENV_FILE=/path/to/.env service-install
make LAUNCHD_WORKDIR=/Users/alexander service-install

💬 How It Works

 ┌───────────┐        ┌──────────┐        ┌──────────────┐
 │ Telegram  │───────▶│ toodles  │───────▶│  gemini-cli  │
 │   user    │◀─ edit │  (Rust)  │◀─ pipe │  subprocess  │
 └───────────┘  msg   └──────────┘  stdout└──────────────┘

  1. User sends a message (text, photo, document, or voice)
  2. Messages are aggregated within a 1.5s window (handles albums and split messages)
  3. On cold start, a startup status is shown while the ACP session is created (or grabbed from the warm pool)
  4. A draft placeholder with 🛑 Stop is attached and updated during generation
  5. User can press Stop at any time; generation is cancelled via CancellationToken
  6. Final response is committed with Markdown→Telegram HTML formatting and plain-text fallback
  7. Subsequent messages reuse the same topic/chat session automatically
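
The aggregation in step 2 can be pictured with a small synchronous sketch. This is not the code from aggregator.rs: the `Incoming` type and the `batch_by_window` function are hypothetical names, and the sketch assumes the window restarts on every message (debounce style) and that messages arrive sorted by time.

```rust
use std::time::Duration;

/// A received message: arrival time relative to some reference instant,
/// plus its text. Hypothetical type, for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct Incoming {
    pub at: Duration,
    pub text: String,
}

/// Groups messages so that each message arriving within `window` of the
/// previous one joins the same batch. Assumes `messages` is sorted by `at`.
pub fn batch_by_window(messages: &[Incoming], window: Duration) -> Vec<Vec<String>> {
    let mut batches: Vec<Vec<String>> = Vec::new();
    let mut last_at: Option<Duration> = None;
    for msg in messages {
        // A message joins the current batch only if the gap to the
        // previous message is no longer than the window.
        let joins_previous = match last_at {
            Some(prev) => msg.at.saturating_sub(prev) <= window,
            None => false,
        };
        if joins_previous {
            if let Some(batch) = batches.last_mut() {
                batch.push(msg.text.clone());
            }
        } else {
            batches.push(vec![msg.text.clone()]);
        }
        last_at = Some(msg.at);
    }
    batches
}
```

With a 1.5s window, messages at 0s, 1.0s and 2.4s end up in one batch because every gap is under 1.5s, while a message at 5.0s starts a new batch. A live bot would more likely drive this with an async timer than with recorded timestamps.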

🎙 Voice Transcription

toodles supports two transcription backends:

┌────────────────────┐     ┌───────────────┐     ┌───────────┐
│   Telegram Voice   │────▶│    ffmpeg     │────▶│ Parakeet  │──── text
│    (OGG Opus)      │     │  (16kHz f32)  │     │   V3 🦜   │
└────────────────────┘     └───────────────┘     └─────┬─────┘
                                                       │ fallback
                                                 ┌─────▼─────┐
                                                 │  OpenAI   │
                                                 │  Whisper  │
                                                 └───────────┘

| Mode | Latency | Cost | Setup |
| --- | --- | --- | --- |
| Local (Parakeet V3) | ~2-5s | Free | --setup downloads 478 MB model |
| Cloud (Whisper API) | ~1-3s | ~$0.006/min | Requires OPENAI_API_KEY |

If both are enabled, local transcription is tried first with automatic cloud fallback.
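
The local-first, cloud-fallback order could look roughly like the sketch below. The `Transcriber` trait and `transcribe_with_fallback` function are hypothetical names introduced for illustration; the actual types in transcription.rs may differ, and the real bot is presumably asynchronous.

```rust
/// Hypothetical backend abstraction; not taken from the toodles source.
pub trait Transcriber {
    /// Takes 16 kHz mono f32 samples and returns the recognised text.
    fn transcribe(&self, audio: &[f32]) -> Result<String, String>;
}

/// Tries the local backend first, then the cloud backend.
/// Either backend may be `None` when it is disabled in the configuration.
/// Returns the last error seen if every enabled backend fails.
pub fn transcribe_with_fallback(
    local: Option<&dyn Transcriber>,
    cloud: Option<&dyn Transcriber>,
    audio: &[f32],
) -> Result<String, String> {
    let mut last_err = String::from("no transcription backend enabled");
    // `Option` is iterable, so chaining yields the enabled backends in order.
    for backend in local.into_iter().chain(cloud) {
        match backend.transcribe(audio) {
            Ok(text) => return Ok(text),
            Err(e) => last_err = e,
        }
    }
    Err(last_err)
}
```

Keeping the backends behind one trait means the fallback order lives in a single place, and either backend can be switched off without touching the voice handler.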

⚙️ Configuration

All configuration is managed through environment variables or .env:

# Required
TELEGRAM_BOT_TOKEN=123456:ABC-DEF...

# Access control (leave empty for unrestricted)
ALLOWED_USER_IDS=123456789,987654321

# Gemini CLI
GEMINI_CLI_PATH=gemini                # path to binary
GEMINI_CLI_COMMAND=gemini --acp       # optional full ACP command
GEMINI_WORKING_DIR=/path/to/project   # optional cwd
GEMINI_YOLO=true                      # optional auto-approve mode
DRAFT_MODE=verbose                    # compact | verbose draft UX
THREAD_RENAME_EVERY=4                 # 0 disables auto-rename
WARM_SESSION_POOL_SIZE=1              # 0 disables the prewarmed session pool

# Optional: read additional settings from TOML
TOODLES_CONFIG=~/.config/toodles/config.toml

# System prompt: customise the bot's personality
SYSTEM_PROMPT=You are a helpful AI assistant. Keep answers concise.

# Voice: cloud (optional fallback)
OPENAI_API_KEY=sk-...

# Voice: local (recommended)
USE_LOCAL_TRANSCRIPTION=true
MODELS_DIR=~/.toodles/models

# Logging
RUST_LOG=info

💡 Tip: Run make setup to generate this interactively!
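
As an illustration of the ALLOWED_USER_IDS semantics described above (a comma-separated list, empty meaning unrestricted), here is a minimal parsing sketch. The function names are hypothetical and are not taken from config.rs; how the real bot treats malformed entries is not documented here, so rejecting them is an assumption.

```rust
use std::collections::HashSet;

/// Parses a comma-separated list of numeric Telegram user IDs.
/// An empty or whitespace-only value yields `None`, meaning "unrestricted".
/// Assumption: a malformed entry is reported as an error rather than skipped.
pub fn parse_allowed_user_ids(raw: &str) -> Result<Option<HashSet<u64>>, String> {
    let mut ids = HashSet::new();
    for part in raw.split(',').map(str::trim).filter(|p| !p.is_empty()) {
        let id = part
            .parse::<u64>()
            .map_err(|e| format!("invalid user id {:?}: {}", part, e))?;
        ids.insert(id);
    }
    Ok(if ids.is_empty() { None } else { Some(ids) })
}

/// Returns true when there is no allowlist or when the user is on it.
pub fn is_allowed(allowlist: &Option<HashSet<u64>>, user_id: u64) -> bool {
    match allowlist {
        None => true,
        Some(ids) => ids.contains(&user_id),
    }
}
```

Failing loudly on a malformed ID is the safer default for an access-control setting, since silently dropping an entry could lock out a user without any hint why.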

Optional TOML config

You can also keep settings in ~/.config/toodles/config.toml:

bot_token = "123456:ABC-DEF..."
gemini_cli_command = "gemini --acp"
gemini_working_dir = "/path/to/project"
gemini_yolo = true
draft_mode = "verbose"
thread_rename_every = 4
warm_session_pool_size = 1

You can copy config.example.toml as a starting point.

🤖 Bot Commands

| Command | Description |
| --- | --- |
| /start | Get started 👋 |
| /new | Start fresh 🔄 |
| /status | Bot status 📊 |
| /thread | Create forum thread 🧵 |
| /help | Show commands 💡 |

/thread works in forum-enabled supergroups where the bot has topic-management rights. You can call /thread from both the main chat and existing topics; Toodles creates a new topic in the same group. The first user message in a topic sets its initial title, then Toodles refreshes the title every THREAD_RENAME_EVERY messages using the recent message context.
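
The title refresh cadence can be expressed as a small predicate. This is only a sketch: `should_rename` is a hypothetical name, and it assumes that messages are counted from 1 per topic, that refreshes land on exact multiples of THREAD_RENAME_EVERY, and that a value of 0 disables all automatic titling, including the initial title.

```rust
/// Decides whether the topic title should be (re)generated after the
/// `message_count`-th user message in a topic (1-based).
/// Assumption: `rename_every == 0` turns off every automatic rename.
pub fn should_rename(message_count: u32, rename_every: u32) -> bool {
    if rename_every == 0 || message_count == 0 {
        return false;
    }
    // The first message sets the initial title; later titles are refreshed
    // on every `rename_every`-th message.
    message_count == 1 || message_count % rename_every == 0
}
```

Under these assumptions, THREAD_RENAME_EVERY=4 titles the topic on message 1 and refreshes it on messages 4, 8, 12, and so on.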

🧯 Cold-Start Tuning

If the first response sometimes takes too long:

  • Set WARM_SESSION_POOL_SIZE=1 (or 2) to keep prewarmed ACP sessions ready.
  • Keep GEMINI_WORKING_DIR on a local SSD path (avoid slow network mounts).
  • Check bot logs for repeated ACP initialize retries; transient failures are retried automatically.

If /thread fails with "not enough rights to create a topic", grant the bot admin permission to manage topics.
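
For readers wondering what the automatic retries for ACP initialization amount to, the following is a generic retry-with-backoff sketch. It is synchronous and uses only the standard library; the bot itself presumably uses an async sleep, and the attempt count, initial delay, and doubling factor shown here are assumptions, not values from session.rs.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Runs `op` up to `max_attempts` times, doubling the delay after each
/// failure. `op` receives the 1-based attempt number.
/// Returns the first success, or the error from the final attempt.
pub fn retry_with_backoff<T, E, F>(
    max_attempts: u32,
    initial_delay: Duration,
    mut op: F,
) -> Result<T, E>
where
    F: FnMut(u32) -> Result<T, E>,
{
    assert!(max_attempts > 0, "max_attempts must be at least 1");
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(e) if attempt >= max_attempts => return Err(e),
            Err(_) => {
                sleep(delay);
                delay = delay.saturating_mul(2);
                attempt += 1;
            }
        }
    }
}
```

Returning the final error instead of swallowing it matches the README's promise that startup failures are surfaced to the user once the retries are exhausted.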

Architecture

src/
├── main.rs             - entry point, dispatcher, bot commands
├── config.rs           - Config from env + optional TOML (single gemini profile)
├── session.rs          - ACP session lifecycle + per-chat/topic session mapping
├── aggregator.rs       - message batching with debounce window + file guard ownership
├── telegram_api.rs     - raw Telegram API (sendMessageDraft), global HTTP client
├── setup.rs            - interactive setup wizard (--setup)
├── transcription.rs    - Parakeet V3 engine + model download
└── handlers/
    ├── mod.rs           - CancelRegistry, inline stop button, draft streaming, message splitting, Markdown→HTML
    ├── message.rs       - text message handler (with aggregation)
    ├── document.rs      - document/file handler (download + aggregate + query)
    ├── photo.rs         - photo handler (download + aggregate albums + query)
    └── voice.rs         - voice handler (transcribe → query)

Session lifecycle:

stateDiagram-v2
    [*] --> New: /new or first message
    New --> Ready: session created
    Ready --> Query: user message
    Query --> Placeholder: ⏳ + 🛑 Stop button
    Placeholder --> Streaming: line-by-line via BufReader
    Streaming --> Cancelled: user clicks 🛑
    Cancelled --> Ready: ⬛ Generation stopped
    Streaming --> Ready: response committed (Markdown)
    Ready --> [*]: /new (reset)

Each chat or forum topic maps to an isolated ACP session. Queries are serialised per session via tokio::sync::Mutex and a per-session queue. Startup uses retries and an optional warm pool (WARM_SESSION_POOL_SIZE) to reduce first-token latency. During generation, the bot updates one placeholder message (draft UX), supports inline cancellation via CancellationToken, and commits a final Markdown→Telegram HTML response with plain-text fallback. Long responses are split across multiple Telegram messages at newline boundaries. Sequential messages and photo albums are aggregated via a 1.5s debounce window. Temporary files (photos, documents) are kept alive via Arc<TempFileGuard> until the query completes.
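
Splitting at newline boundaries, as mentioned above, might be implemented along these lines. `split_message` is a hypothetical name rather than the function in handlers/mod.rs, and counting the limit in Rust `char`s is a simplification. Telegram caps a text message at 4096 characters, so a caller would pass a limit at or below that.

```rust
/// Splits `text` into chunks of at most `max_chars` characters, preferring
/// to break right after the last newline that fits. Falls back to a hard
/// cut when a single line is longer than the limit.
/// Concatenating the chunks reproduces the original text.
pub fn split_message(text: &str, max_chars: usize) -> Vec<String> {
    assert!(max_chars > 0, "max_chars must be positive");
    // Work on chars so a multi-byte UTF-8 sequence is never cut in half.
    let chars: Vec<char> = text.chars().collect();
    let mut chunks: Vec<String> = Vec::new();
    let mut start = 0;
    while start < chars.len() {
        let hard_end = (start + max_chars).min(chars.len());
        let end = if hard_end == chars.len() {
            // The remainder fits in one message.
            hard_end
        } else {
            // Break after the last newline inside the window, if there is one.
            match chars[start..hard_end].iter().rposition(|&c| c == '\n') {
                Some(pos) => start + pos + 1,
                None => hard_end,
            }
        };
        chunks.push(chars[start..end].iter().collect());
        start = end;
    }
    chunks
}
```

For example, `split_message("aaa\nbbb\nccc", 8)` yields two chunks, "aaa\nbbb\n" and "ccc". A production version would also need to avoid cutting inside an HTML tag once the text has been converted to Telegram HTML, which this sketch does not attempt.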

🛠 Makefile

make help          # show all targets
make build         # debug build
make release       # optimized build
make run           # run (debug)
make run-release   # run (release)
make setup         # interactive setup wizard
make test          # run tests
make lint          # clippy
make fmt           # format code
make clean         # clean artifacts
make service-install   # install/start launchd service
make service-sync-env  # copy .env into launchd service env
make service-update    # rebuild + restart launchd service
make service-stop      # stop launchd service
make service-status    # print launchd status
make service-logs      # tail service logs
make service-uninstall # remove launchd service

📄 License

MIT; see LICENSE.
