MarianodelRio/bots-platform

Conversational Bots Platform

Production-grade multi-tenant conversational AI platform — LLM-powered free-text understanding, declarative YAML state machines, one isolated container per client, pluggable connector framework with circuit breakers and retries.

Each client runs its own container from the same image. Configuration alone determines behavior: adding a new tenant means starting one more container with its own configuration, with no code changes and no redeployment of what is already running.


Architecture: Two-Plane Design

┌──────────────────────────────────────────────────────┐
│                   CONTROL PLANE                       │
│              (shared VM — :8001)                      │
│                                                       │
│  Tenant & Identity · Flow Authoring · LLM Knowledge  │
│  Task Scheduler · Observability · Tenant Orchestrator │
│                                                       │
│              PostgreSQL (shared DB)                   │
└───────────────────────┬──────────────────────────────┘
                        │  boot-time config pull
             ┌──────────┼──────────┐
             ▼          ▼          ▼
        ┌─────────┐ ┌─────────┐ ┌─────────┐
        │Tenant A │ │Tenant B │ │Tenant C │
        │:8002    │ │:8003    │ │...      │
        │         │ │         │ │         │
        │Channel  │ │Channel  │ │Channel  │
        │Adapter  │ │Adapter  │ │Adapter  │
        │    ↓    │ │    ↓    │ │    ↓    │
        │Bot      │ │Bot      │ │Bot      │
        │Engine   │ │Engine   │ │Engine   │
        │  ↙   ↘  │ │  ↙   ↘  │ │  ↙   ↘  │
        │FSM   LLM│ │FSM   LLM│ │FSM   LLM│
        │    ↓    │ │    ↓    │ │    ↓    │
        │Connector│ │Connector│ │Connector│
        │Registry │ │Registry │ │Registry │
        │+ Sched. │ │+ Sched. │ │+ Sched. │
        └─────────┘ └─────────┘ └─────────┘
          DATA PLANE — same image, different config

| Plane | Instances | Responsibility |
|---|---|---|
| Control Plane | 1 shared | Manages tenants, flows, credentials, LLM knowledge, scheduler visibility, observability |
| Data Plane | 1 per client | Receives messages, runs the FSM, invokes connectors, generates responses |

Design decisions

Boot-time config pull. Containers are stateless relative to the Control Plane. At startup each container fetches its full configuration (flow, connector bindings, LLM knowledge, encrypted credentials) via an authenticated API call. No runtime coupling between planes — a CP outage doesn't take down live bots.
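
A minimal sketch of the boot-time pull, assuming a hypothetical `/tenants/{tenant_id}/boot-config` route, bearer-token authentication, and a JSON payload with at least `flow` and `connectors` keys. None of these details are specified above, so treat them as placeholders. The fetch function is injected so the logic can be exercised without a network.

```python
import json
import urllib.request
from typing import Callable

# Hypothetical route; the real Control Plane endpoint may differ.
BOOT_CONFIG_PATH = "/tenants/{tenant_id}/boot-config"

# Assumed minimum payload; the real schema is not documented here.
REQUIRED_KEYS = ("flow", "connectors")


def http_fetch(url: str, token: str) -> bytes:
    """Fetch raw bytes from the Control Plane using a bearer token."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()


def load_boot_config(
    base_url: str,
    tenant_id: str,
    token: str,
    fetch: Callable[[str, str], bytes] = http_fetch,
) -> dict:
    """Pull the full tenant configuration once, at startup."""
    url = base_url.rstrip("/") + BOOT_CONFIG_PATH.format(tenant_id=tenant_id)
    config = json.loads(fetch(url, token))
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"boot config is missing keys: {missing}")
    return config
```

Because the call happens only once, the container keeps serving from the loaded configuration even if the Control Plane later becomes unreachable.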

SQLite per container. Conversation state and the scheduler's job store live in a SQLite database on a volume scoped to the container. No container mounts another tenant's volume, so cross-tenant data access is ruled out by construction.
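
As an illustration only, a per-container conversation store could look like the sketch below. The class name `ToyStateStore`, the table layout, and the method names are assumptions, not the repository's `SQLiteStateStore`.

```python
import json
import sqlite3
from typing import Optional, Tuple


class ToyStateStore:
    """Toy conversation store backed by one SQLite file (hypothetical schema)."""

    def __init__(self, path: str) -> None:
        # In a container, `path` would point inside the tenant's own volume.
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS conversations ("
            "contact_id TEXT PRIMARY KEY, state TEXT NOT NULL, data TEXT NOT NULL)"
        )

    def save(self, contact_id: str, state: str, data: dict) -> None:
        # Upsert: exactly one row per contact.
        self._db.execute(
            "INSERT INTO conversations (contact_id, state, data) VALUES (?, ?, ?) "
            "ON CONFLICT(contact_id) DO UPDATE SET "
            "state = excluded.state, data = excluded.data",
            (contact_id, state, json.dumps(data)),
        )
        self._db.commit()

    def load(self, contact_id: str) -> Optional[Tuple[str, dict]]:
        row = self._db.execute(
            "SELECT state, data FROM conversations WHERE contact_id = ?",
            (contact_id,),
        ).fetchone()
        return None if row is None else (row[0], json.loads(row[1]))
```

Passing `":memory:"` as the path gives a throwaway database, which is convenient in tests.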

Embedded scheduler. Task scheduling (reminders, recurring jobs) runs inside each Data Plane container via APScheduler + SQLite jobstore. No CP→container network round-trip per dispatch; jobs survive container restarts.

Hexagonal ports throughout. ChannelAdapter, ConnectorPort, StateStorePort, and SchedulerPort are abstract interfaces. Swapping WhatsApp for Telegram, or Google Calendar for Outlook, is an adapter swap — zero engine changes.
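
The port/adapter split can be sketched as follows. The `send_text` signature and the two toy adapters are hypothetical; the real `ChannelAdapter` interface may expose different methods.

```python
from abc import ABC, abstractmethod


class ChannelAdapter(ABC):
    """Port: engine code depends only on this interface."""

    @abstractmethod
    def send_text(self, contact_id: str, text: str) -> None:
        ...


class InMemoryChannel(ChannelAdapter):
    """Adapter that records outgoing messages; handy for tests."""

    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    def send_text(self, contact_id: str, text: str) -> None:
        self.sent.append((contact_id, text))


class UppercaseChannel(InMemoryChannel):
    """A second adapter with different behaviour, to show the swap."""

    def send_text(self, contact_id: str, text: str) -> None:
        super().send_text(contact_id, text.upper())


def greet(channel: ChannelAdapter, contact_id: str) -> None:
    """Stand-in for engine code: it never names a concrete adapter."""
    channel.send_text(contact_id, "Hi! What would you like to do?")
```

`greet` works unchanged with either adapter, which is the property the ports are meant to guarantee.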


LLM Integration

The platform extends the deterministic FSM with LLM-powered free-text understanding as a first-class connector. When a user sends a message that matches no flow transition, the engine can delegate to an LLM instead of jumping straight to the state's fallback. The LLM's answer is sent to the user and the conversation stays in the current state.

User message ──→ Bot Engine
                      │
              FSM transition match?
              ├─ YES ──→ execute transition  (deterministic, always first)
              └─ NO  ──→ llm_fallback enabled for this state?
                              ├─ YES ──→ LLMConnector.answer(question, knowledge, context)
                              │                    │
                              │          response text sent to user
                              │          current_state unchanged
                              └─ NO  ──→ FSM fallback state

Key design constraints:

  • LLM only generates text. It never transitions states, never executes connectors, never mutates conversation data. The FSM remains fully in control.
  • Knowledge in context, no vector DB. Each tenant uploads a versioned markdown document (business info, FAQs, hours, prices). Small-business knowledge fits entirely in the prompt — zero extra infrastructure.
  • Safe fallback. Out-of-scope questions get a canned "I don't have that information" response, never a hallucination.
  • Per-tenant cost control. Usage counters with configurable monthly token quotas; quota exceeded triggers the safe fallback.
  • Fully testable without credentials. MockLLMAdapter is scriptable per test scenario — make test passes with no API keys or network.
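
The constraints above can be sketched as a single routing function. Everything here is illustrative: the `State` and `Quota` shapes, the LLM callable signature `(question, knowledge) -> (answer or None, tokens)`, and the choice to stay in the current state when the quota is exhausted are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

SAFE_FALLBACK = "I don't have that information."


@dataclass
class State:
    name: str
    transitions: dict[str, str] = field(default_factory=dict)  # payload -> target
    llm_fallback: bool = False
    fallback: str = "MENU"


@dataclass
class Quota:
    monthly_tokens: int
    used: int = 0

    def exhausted(self) -> bool:
        return self.used >= self.monthly_tokens


# Hypothetical LLM signature: returns (answer or None if out of scope, tokens used).
LLM = Callable[[str, str], tuple[Optional[str], int]]


def handle(
    state: State, message: str, llm: LLM, knowledge: str, quota: Quota
) -> tuple[str, Optional[str]]:
    """Return (next_state, reply_text). The FSM is always consulted first."""
    if message in state.transitions:
        return state.transitions[message], None  # deterministic path
    if not state.llm_fallback:
        return state.fallback, None  # plain FSM fallback
    if quota.exhausted():
        return state.name, SAFE_FALLBACK  # no LLM call once the quota is spent
    answer, tokens = llm(message, knowledge)
    quota.used += tokens
    # The LLM only produces text; the state never changes on this path.
    return state.name, answer if answer is not None else SAFE_FALLBACK
```

Note that the LLM branch can only ever return `state.name`, which is how the sketch encodes "LLM only generates text".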

Knowledge management

# In the tenant's boot config (delivered by Control Plane)
llm:
  enabled: true
  provider: anthropic          # claude-haiku class — fast and cheap for FAQ
  knowledge_version: v3        # versioned in CP like flows; hot-swappable
  quota_monthly_tokens: 500000

# In any flow state
MENU:
  llm_fallback: true           # free-text messages that match no transition go to LLM
  fallback: MENU               # FSM fallback if LLM is disabled or quota exceeded

| Message type | Handler |
|---|---|
| Button press / list selection | FSM transition (deterministic) |
| Recognized text payload | FSM transition (deterministic) |
| Free text, in knowledge scope | LLM → answer from knowledge doc |
| Free text, out of knowledge scope | Safe fallback message |
| Free text, quota exceeded | Safe fallback message |

Stack

| Component | Choice |
|---|---|
| Language | Python 3.14 |
| Web framework | FastAPI + uvicorn |
| Dependency manager | uv |
| Control Plane DB | PostgreSQL 16 + SQLAlchemy async + Alembic |
| Data Plane state | SQLite (per-container volume) |
| Scheduler | APScheduler 3.x + SQLite jobstore |
| Credential encryption | Fernet (symmetric, key managed by CP) |
| LLM provider | Anthropic (claude-haiku class) |
| Linter / formatter | ruff |
| Type checking | mypy |
| Tests | pytest |

Quick Start

Prerequisites

# uv (dependency manager)
curl -LsSf https://astral.sh/uv/install.sh | sh

Local dev (no Docker)

# 1. Install dependencies
uv sync

# 2. Start Control Plane  →  http://localhost:8001
make run-control-plane

# 3. Start Data Plane with a dev tenant
TENANT_CONFIG_PATH=tests/configs/dev_tenant.yaml make run-data-plane
# →  http://localhost:8002

Full stack with Docker Compose

make up      # Control Plane + Data Plane + PostgreSQL
make down    # stop (keeps DB)
make clean   # stop and wipe DB

| Service | Port |
|---|---|
| Control Plane | localhost:8001 |
| Data Plane | localhost:8002 |
| PostgreSQL | localhost:5433 |

Project Structure

.
├── control_plane/          # Control plane (platform management)
│   ├── main.py             # FastAPI app · /health
│   └── Dockerfile
│
├── data_plane/             # Data plane (per-tenant bot runtime)
│   ├── main.py             # FastAPI app · lifespan · /health
│   ├── config.py           # TenantConfig — loads YAML / CP boot payload
│   ├── engine/
│   │   ├── bot.py          # Bot — central orchestrator
│   │   ├── interpreter.py  # FSM interpreter: states, transitions, LLM fallback
│   │   ├── flow.py         # Flow dataclasses + YAML loader
│   │   ├── outputs.py      # Output types (Text, Buttons, List…)
│   │   └── degradation.py  # Output degradation by channel capabilities
│   ├── ports/
│   │   ├── channel_adapter.py  # ABC ChannelAdapter
│   │   ├── connector.py        # ABC ConnectorPort
│   │   ├── state_store.py      # ABC StateStorePort
│   │   └── scheduler.py        # ABC SchedulerPort
│   ├── adapters/
│   │   ├── channel/
│   │   │   ├── whatsapp.py     # WhatsAppAdapter (Cloud API v19, HMAC verified)
│   │   │   ├── http_dev.py     # HttpDevChannelAdapter (local dev)
│   │   │   └── factory.py      # channel_factory(config) → (adapter, router)
│   │   ├── connectors/
│   │   │   ├── google_calendar/  # GoogleCalendarAdapter
│   │   │   │   ├── adapter.py    # Implements CalendarConnector
│   │   │   │   ├── client.py     # Service account auth (thread-local)
│   │   │   │   ├── engine.py     # compute_slots() — pure function
│   │   │   │   ├── repository.py # Raw Calendar API fetches
│   │   │   │   ├── mutations.py  # create/cancel/confirm/mark events
│   │   │   │   ├── queries.py    # High-level reads
│   │   │   │   └── parser.py     # Description parsing + normalization
│   │   │   ├── llm/
│   │   │   │   ├── adapter.py    # AnthropicLLMAdapter (implements LLMConnector)
│   │   │   │   └── mock.py       # MockLLMAdapter (scriptable for tests)
│   │   │   └── mock_calendar.py  # MockCalendarAdapter (tests)
│   │   ├── scheduler/
│   │   │   ├── apscheduler.py    # APSchedulerAdapter (SQLite jobstore)
│   │   │   └── noop.py           # NoopScheduler (tests)
│   │   └── state_store/
│   │       ├── sqlite.py         # SQLiteStateStore (production)
│   │       └── in_memory.py      # InMemoryStateStore (tests)
│   └── connectors/
│       ├── registry.py           # ConnectorRegistry: category → implementation
│       ├── circuit_breaker.py    # Circuit breaker: CLOSED / OPEN / HALF_OPEN
│       └── categories/
│           ├── calendar.py       # CalendarConnector ABC
│           ├── llm.py            # LLMConnector ABC
│           └── notification.py   # NotificationConnector ABC
│
├── shared/
│   ├── domain/
│   │   ├── messages.py     # InternalMessage (channel-agnostic)
│   │   └── conversation.py # ConversationState
│   └── connectors/
│       └── schemas/        # JSON Schemas for connector configs (autodiscovery)
│
├── flows/
│   └── peluqueria_flow.yaml   # Example flow: hairdresser appointment booking
│
├── tests/
│   ├── configs/            # Tenant YAMLs for tests
│   └── flows/              # Toy flows for tests
│
├── docker-compose.yml
├── Makefile
└── pyproject.toml

Declarative YAML Flows

The bot executes a state graph defined in YAML. The same generic engine runs any flow for any client — the flow is data, not code.

id: my_flow_v1
initial_state: MENU

global_transitions:
  - on_payload: "back_to_menu"
    target: MENU

states:
  MENU:
    llm_fallback: true        # free-text questions answered from business knowledge
    on_enter:
      - action: send_interactive_buttons
        body: "Hi! What would you like to do?"
        buttons:
          - id: "menu_book"
            title: "Book appointment"
          - id: "menu_cancel"
            title: "Cancel appointment"
    transitions:
      - on_payload: "menu_book"
        target: BOOK_SELECT_SERVICE
      - on_payload: "menu_cancel"
        target: CANCEL_SELECT
    fallback: MENU

  BOOK_SELECT_SERVICE:
    on_enter:
      - action: invoke_connector
        connector: calendar
        operation: get_available_days
        params:
          from_date: "{{data.today}}"
          lookahead_days: 14
        result_key: available_days
      - action: send_dynamic_options
        source_key: "available_days"
        text: "Which day works for you?"
        empty_text: "No days available right now."
    transitions:
      - on_payload_prefix: "day_"
        extract_suffix_as: "selected_date"
        target: BOOK_CONFIRM
    fallback: MENU

Available actions

| Action | Description |
|---|---|
| send_text | Plain text message |
| send_interactive_buttons | Up to 3 buttons (WhatsApp) |
| send_interactive_list | Option list (up to 10 rows on WhatsApp) |
| send_dynamic_options | List built dynamically from a connector result |
| invoke_connector | Call an external connector (calendar, LLM, etc.) |
| schedule_task | Schedule a one-off or recurring task |
| cancel_task | Cancel a previously scheduled task by idempotency key |
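
Since channels differ in what they can render (for example, at most 3 buttons on WhatsApp), outputs may need to degrade. The sketch below shows one possible rule, turning buttons into a numbered text message when the channel cannot display them. The `Capabilities` type and the payload dictionaries are hypothetical and not taken from `degradation.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capabilities:
    max_buttons: int  # 0 means the channel has no button support


def render_buttons(body: str, titles: list[str], caps: Capabilities) -> dict:
    """Return a buttons payload if the channel can show it, else plain text."""
    if 0 < len(titles) <= caps.max_buttons:
        return {"type": "buttons", "body": body, "buttons": titles}
    # Degrade: numbered options the user can answer by typing.
    lines = [body] + [f"{i}. {title}" for i, title in enumerate(titles, start=1)]
    return {"type": "text", "text": "\n".join(lines)}
```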

Transitions

| Field | Effect |
|---|---|
| on_payload | Exact match on button/list payload |
| on_payload_prefix | Payload starts with prefix; extract_suffix_as stores suffix in data |
| on_type | Message type (text, button, list) |
| condition | Expression over data (e.g. "data.selected_service") |
| set_data | Write values into data before entering the next state |
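
A rough sketch of how `on_payload` and `on_payload_prefix` matching could work over a flow that has already been parsed into dictionaries. The evaluation order (global transitions first, then state transitions, then the state's fallback) is an assumption, and `on_type`, `condition`, and `set_data` are left out for brevity.

```python
from typing import Optional


def match_transition(transitions: list[dict], payload: str, data: dict) -> Optional[str]:
    """Return the target of the first matching transition, or None."""
    for transition in transitions:
        if transition.get("on_payload") == payload:
            return transition["target"]
        prefix = transition.get("on_payload_prefix")
        if prefix is not None and payload.startswith(prefix):
            key = transition.get("extract_suffix_as")
            if key:
                # Store whatever follows the prefix, e.g. "day_2025-06-01" -> "2025-06-01".
                data[key] = payload[len(prefix):]
            return transition["target"]
    return None


def next_state(flow: dict, current: str, payload: str, data: dict) -> str:
    """Resolve the next state name for an incoming payload."""
    state = flow["states"][current]
    candidates = (flow.get("global_transitions", []), state.get("transitions", []))
    for transitions in candidates:
        target = match_transition(transitions, payload, data)
        if target is not None:
            return target
    return state["fallback"]
```

With the example flow above, a payload of `day_2025-06-01` in `BOOK_SELECT_SERVICE` would move to `BOOK_CONFIRM` and leave `selected_date` set to `2025-06-01` in `data`.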

Tenant Configuration

Each Data Plane container loads its configuration from the Control Plane at boot (or from a local YAML pointed to by TENANT_CONFIG_PATH in dev).

# tests/configs/dev_tenant.yaml
tenant_id: my_business
flow_path: flows/peluqueria_flow.yaml

channel:
  type: http_dev          # http_dev | whatsapp

connectors:
  calendar:
    type: mock            # mock | mock_calendar | google_calendar
  llm:
    type: mock_llm        # mock_llm | anthropic

WhatsApp channel

channel:
  type: whatsapp
  phone_number_id: "1234567890"
  access_token: "EAAxxxxxxx"
  app_secret: "abc123"
  verify_token: "my_verify_token"

Google Calendar connector

connectors:
  calendar:
    type: google_calendar
    credentials_path: "/path/to/service_account.json"
    calendar_id: "business@group.calendar.google.com"
    timezone: "Europe/Madrid"
    slot_duration_min: 30
    lookahead_days_client: 14
    lookahead_days_manual: 60
    schedule:
      mon: ["10:00-14:00", "17:00-21:00"]
      tue: ["10:00-14:00", "17:00-21:00"]
      wed: ["10:00-14:00", "17:00-21:00"]
      thu: ["10:00-14:00", "17:00-21:00"]
      fri: ["10:00-14:00", "17:00-21:00"]
      sat: ["10:00-14:00"]

See tests/configs/google_calendar_tenant.yaml.example for the full template.
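
To make the `schedule` and `slot_duration_min` fields concrete, here is a sketch that expands one day's opening windows into slot start times. The function name `slot_starts` is hypothetical, and the sketch ignores time zones and existing bookings, which a real slot computation would have to handle.

```python
from datetime import date, datetime, timedelta

WEEKDAYS = ("mon", "tue", "wed", "thu", "fri", "sat", "sun")


def slot_starts(day: date, schedule: dict[str, list[str]], slot_min: int) -> list[datetime]:
    """Return naive slot start times for one day from 'HH:MM-HH:MM' windows."""
    slots: list[datetime] = []
    step = timedelta(minutes=slot_min)
    # Days missing from the schedule (e.g. "sun") are treated as closed.
    for window in schedule.get(WEEKDAYS[day.weekday()], []):
        start_text, end_text = window.split("-")
        start = datetime.combine(day, datetime.strptime(start_text, "%H:%M").time())
        end = datetime.combine(day, datetime.strptime(end_text, "%H:%M").time())
        # Only keep slots that fit entirely inside the window.
        while start + step <= end:
            slots.append(start)
            start += step
    return slots
```

A pure function like this takes no I/O dependencies, so it can be unit-tested without touching the Calendar API.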


Channels

| Channel | Class | Use |
|---|---|---|
| WhatsApp Cloud API | WhatsAppAdapter | Production: HMAC verified, buttons, lists, templates |
| HTTP Dev | HttpDevChannelAdapter | Local dev: POST /inbound + GET /messages (pull model) |
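
"HMAC verified" usually means checking the webhook body against a signature header computed with the app secret. The sketch below assumes the `sha256=<hex digest>` header format used by Meta webhooks (`X-Hub-Signature-256`); the function name is hypothetical and this is not the adapter's actual code.

```python
import hashlib
import hmac


def verify_signature(app_secret: str, body: bytes, header_value: str) -> bool:
    """Check a 'sha256=<hex digest>' signature header against the raw body."""
    scheme, _, received = header_value.partition("=")
    if scheme != "sha256" or not received:
        return False
    expected = hmac.new(app_secret.encode(), body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(expected.encode(), received.encode())
```

The signature must be computed over the raw request bytes, before any JSON parsing, or re-serialization differences will cause false rejections.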

Testing the bot locally

# Start the data plane with the http_dev channel
TENANT_CONFIG_PATH=tests/configs/dev_tenant.yaml make run-data-plane

# Send a message
curl -X POST http://localhost:8002/inbound \
  -H "Content-Type: application/json" \
  -d '{"contact_id": "user_1", "text": "hello"}'

# Read the bot's responses
curl http://localhost:8002/messages

Connector Framework

Flows reference connectors by category (connector: calendar); the tenant config selects the concrete implementation (type: google_calendar). The ConnectorRegistry resolves the mapping at runtime — the engine knows nothing about specific adapters.

Cross-cutting concerns applied centrally (not inside each connector):

  • Exponential backoff retries (tenacity)
  • Circuit breaker: CLOSED → OPEN → HALF_OPEN
  • Configurable timeouts
  • Structured logging

| Connector | Category | Config type | Status |
|---|---|---|---|
| GoogleCalendarAdapter | CalendarConnector | google_calendar | Implemented |
| AnthropicLLMAdapter | LLMConnector | anthropic | Implemented |
| MockCalendarAdapter | CalendarConnector | mock_calendar | Tests |
| MockLLMAdapter | LLMConnector | mock_llm | Tests |
| MockConnector | Generic | mock | Tests / dev |
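
The three circuit breaker states can be sketched as below. The thresholds, the injected clock, and the class and exception names are assumptions for illustration; the repository's `circuit_breaker.py` may differ, and retries with backoff are not shown.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is OPEN."""


class CircuitBreaker:
    def __init__(
        self,
        max_failures: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "CLOSED"
        if self._clock() - self._opened_at >= self.reset_after:
            return "HALF_OPEN"  # cool-down elapsed: allow a trial call
        return "OPEN"

    def call(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self.state == "OPEN":
            raise CircuitOpenError("connector temporarily disabled")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures while CLOSED, opens the circuit.
            if self._opened_at is not None or self._failures >= self.max_failures:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Wrapping connector calls in the registry rather than in each adapter keeps this logic in one place, which matches the "applied centrally" point above.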

Task Scheduler

Each container runs an embedded APScheduler instance with a SQLite jobstore (in the container volume):

  • Scheduled jobs survive container restarts
  • The YAML flow defines tasks: (ad-hoc) and standing_schedules: (registered at boot)
  • Flow actions: schedule_task / cancel_task with idempotency keys
  • Dispatch: APScheduler → POST /run-task (internal) → TaskExecutor → Channel Adapter
  • 5-minute misfire grace window

tasks:
  reminder:
    - action: send_text
      text: "Reminder: your appointment is tomorrow at {{data.time}}."

# In a booking state:
on_enter:
  - action: schedule_task
    task: reminder
    idempotency_key: "reminder:{{data.event_id}}"
    run_at: "{{data.reminder_time}}"
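
The idempotency key is what keeps a re-entered state from creating duplicate reminders. A minimal sketch, assuming `{{data.key}}` placeholders are resolved against conversation data and that scheduling with an existing key replaces the earlier job (the source does not say whether duplicates are replaced or ignored). `TaskTable` is a hypothetical in-memory stand-in for the persistent jobstore.

```python
import re

_PLACEHOLDER = re.compile(r"\{\{\s*data\.(\w+)\s*\}\}")


def render(template: str, data: dict) -> str:
    """Replace {{data.key}} placeholders; unknown keys raise KeyError."""
    return _PLACEHOLDER.sub(lambda match: str(data[match.group(1)]), template)


class TaskTable:
    """In-memory stand-in for a jobstore, keyed by idempotency key."""

    def __init__(self) -> None:
        self.jobs: dict[str, dict] = {}

    def schedule(self, action: dict, data: dict) -> str:
        key = render(action["idempotency_key"], data)
        # Same key: overwrite the earlier job instead of adding a duplicate.
        self.jobs[key] = {
            "task": action["task"],
            "run_at": render(action["run_at"], data),
        }
        return key

    def cancel(self, key: str) -> bool:
        """Remove a job; returns False if the key was not scheduled."""
        return self.jobs.pop(key, None) is not None
```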

Commands

make run-control-plane   # Start Control Plane on localhost:8001
make run-data-plane      # Start Data Plane on localhost:8002 (requires TENANT_CONFIG_PATH)
make up                  # Full stack via Docker Compose
make down                # Stop containers (keeps DB)
make clean               # Stop containers and wipe DB (down -v)
make test                # Run all tests
make lint                # ruff check + mypy
make format              # ruff format

Tests

make test                                              # all tests
uv run pytest tests/test_bot_engine.py -v             # single file
uv run pytest -k "peluqueria" -v                      # by name

No credentials or network required. All external APIs (Calendar, WhatsApp, LLM) are mocked.
