Operations

Runtime Services

Production deployments require:

  • Application container (Python backend)
  • PostgreSQL 16+
  • Redis 7+
  • Persistent storage for recordings
  • Public HTTPS URL with WebSocket support
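The requirements above can be sketched as a Compose file; this is a hypothetical layout (service names, image tags, and volume paths are assumptions, not the project's actual file):

```yaml
services:
  app:
    build: .                            # Python backend container
    ports: ["8000:8000"]
    depends_on: [db, redis]
    volumes:
      - recordings:/data/recordings     # persistent storage for recordings
  db:
    image: postgres:16
    volumes:
      - pgdata:/var/lib/postgresql/data
  redis:
    image: redis:7
volumes:
  recordings:
  pgdata:
```

HTTPS and WebSocket termination would sit in front of `app`, typically via a reverse proxy.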

Deployment

Docker Compose

```bash
docker compose up --build
```

This starts PostgreSQL, Redis, and the backend. Run migrations before first use:

```bash
docker compose exec app alembic upgrade head
docker compose exec app python scripts/seed.py
```

Manual Deployment

  1. Provision PostgreSQL and Redis
  2. Deploy the application container
  3. Set all environment variables (see Configuration)
  4. Run alembic upgrade head
  5. Optionally seed: python scripts/seed.py
  6. Ensure WebSocket traffic is routed correctly (no buffering, sticky sessions)
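Step 6 usually comes down to disabling proxy buffering and forwarding the HTTP Upgrade headers. A sketch for nginx (the location path and backend port are placeholders):

```nginx
location /ws/ {
    proxy_pass http://127.0.0.1:8000;        # backend app (placeholder address)
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;  # forward the WebSocket handshake
    proxy_set_header Connection "upgrade";
    proxy_buffering off;                     # stream audio frames without delay
    proxy_read_timeout 3600s;                # keep long-lived calls open
}
```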

Dashboard

Build the dashboard and let the backend serve it:

```bash
cd dashboard && npm install && npm run build
```

The built files in dashboard/dist/ are mounted at / by the FastAPI app.

Database Migrations

Migrations are managed by Alembic. The current migration chain:

| Version | Description |
|---|---|
| 001 | Initial schema (calls, agents, events) |
| 002 | Agent service fields (STT/LLM/TTS config) |
| 003 | Agent provider settings (JSON) |
| 004 | Pipeline settings (VAD, turn detection) |
| 005 | Tools and flow nodes |
| 006 | Telephony settings (multi-provider) |
| 007 | Telephony concurrency limits |
| 008 | Flow node tool_ids |
| 009 | Agent pre-call tool IDs |
| 010 | Vobiz telephony provider |
| 011 | Agent context variables |
| 012 | Extended event types |

Apply all migrations:

```bash
alembic upgrade head
```

Check current version:

```bash
alembic current
```

Logging

The backend uses structlog for structured logging. Key events:

| Event | When |
|---|---|
| call_started | Pipeline begins |
| call_connected | WebSocket connected |
| call_disconnected | WebSocket closed |
| call_completed | Call finalized |
| user_spoke | User transcription captured |
| agent_spoke | Agent response generated |
| tool_called | Tool function invoked |
| recording_saved | WAV file written |
| cleanup_completed | Scheduled cleanup finished |
| node_entered | Flow engine entered a new node |
| node_transition | Flow engine transitioned between nodes |

Set the log level via the LOG_LEVEL environment variable.
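structlog is typically configured to honor the stdlib logging level. A minimal sketch of resolving LOG_LEVEL, using stdlib logging as a stand-in (the backend's actual structlog setup may differ):

```python
import logging
import os


def resolve_log_level(default: str = "INFO") -> int:
    """Map the LOG_LEVEL env var to a stdlib logging level, falling back to INFO."""
    name = os.environ.get("LOG_LEVEL", default).upper()
    return getattr(logging, name, logging.INFO)


# Apply the resolved level to the root logger.
logging.basicConfig(level=resolve_log_level())
```

Unknown values fall back to INFO rather than raising, which keeps a typo in the environment from breaking startup.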

Background Jobs

APScheduler starts automatically with the application:

| Job | Schedule | Description |
|---|---|---|
| Recording cleanup | Daily at 3 AM | Deletes recordings older than RECORDING_RETENTION_DAYS |
| Stale call cleanup | Every 5 minutes | Marks calls stuck in IN_PROGRESS for >30 min as completed |

Concurrency

WebSocket Sessions

Each active call holds one WebSocket connection and runs one Pipecat pipeline. Capacity depends on:

  • CPU (pipeline processing)
  • Network bandwidth (audio streaming)
  • Provider API rate limits (STT/LLM/TTS)
  • Telephony provider concurrency limits

Per-Provider Limits

Each telephony provider has a configurable max_concurrent setting (default: 10). The limit is enforced at:

  • WebSocket accept (returns 503 if exceeded)
  • Outbound call API (returns error if exceeded)

Configure via PUT /api/settings/telephony.
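The per-provider cap behaves like a counter guard around call admission. A hypothetical sketch (class and method names are illustrative, not the backend's actual API):

```python
import threading


class ProviderLimiter:
    """Tracks active calls for one telephony provider against a max_concurrent cap."""

    def __init__(self, max_concurrent: int = 10):
        self.max_concurrent = max_concurrent
        self._active = 0
        self._lock = threading.Lock()

    def try_acquire(self) -> bool:
        """Reserve a call slot; False means reject (e.g. 503 on WebSocket accept)."""
        with self._lock:
            if self._active >= self.max_concurrent:
                return False
            self._active += 1
            return True

    def release(self) -> None:
        """Free a slot when a call ends."""
        with self._lock:
            self._active = max(0, self._active - 1)
```

Both enforcement points (WebSocket accept and the outbound call API) would consult the same counter so the cap holds across entry paths.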

Database Connections

The SQLAlchemy async pool defaults are suitable for moderate load. For higher concurrency:

  • Use managed PostgreSQL or PgBouncer
  • Monitor connection counts
  • Tune pool size in database.py

Scaling

Sticky Sessions

WebSocket calls are stateful. Production load balancers must pin WebSocket connections to a single instance. Configure session affinity or use a WebSocket-aware proxy.
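With plain nginx, affinity can be approximated with ip_hash; a sketch (instance addresses are placeholders):

```nginx
upstream backend {
    ip_hash;                  # pin each client IP to one instance
    server 10.0.0.1:8000;
    server 10.0.0.2:8000;
}
```

Cloud load balancers offer equivalent session-affinity settings; any mechanism that keeps a WebSocket on one instance for its lifetime works.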

Recordings

Recordings are written to local disk by default. Multi-instance deployments need shared storage (NFS, S3-backed FUSE, etc.) if recordings must be accessible across all instances.

Horizontal Scaling

Each call is independent; there is no cross-call shared state beyond the database. Adding instances scales call capacity linearly, provided:

  • Sticky sessions are configured
  • Recording storage is shared or per-instance
  • Database connection pool is sized appropriately

Recovery

Server Restart

On startup, the app marks any IN_PROGRESS calls as FAILED. This prevents stale records from a previous crash.

Stale Calls

The background job catches calls that remain IN_PROGRESS for more than 30 minutes and marks them COMPLETED with an error message.

Troubleshooting

Calls fail immediately after backend starts

  • Verify telephony provider credentials
  • Verify AI service API keys (STT, LLM, TTS)
  • Check WebSocket reachability from the provider
  • Confirm the agent exists in the database

No recordings appear

  • Check RECORDINGS_DIR path and filesystem permissions
  • Verify the call reached the disconnect cleanup phase
  • Check backend logs for recording errors

Dashboard loads but shows empty data

  • Verify DASHBOARD_URL in CORS settings
  • Check database connectivity
  • Run python scripts/seed.py if no agents exist

Migrations fail

If the database is in a partially migrated state:

```bash
# Check current state
alembic current

# If stuck, reset and rerun
alembic downgrade base
alembic upgrade head
```

For a fresh database, simply run alembic upgrade head.