# Operations

## Runtime Services
Production deployments require:
- Application container (Python backend)
- PostgreSQL 16+
- Redis 7+
- Persistent storage for recordings
- Public HTTPS URL with WebSocket support
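Since all of these services must be reachable before the app can serve calls, it helps to fail fast on missing settings at startup. A minimal sketch (the variable names here are illustrative assumptions; the authoritative list is in the Configuration reference):

```python
import os

# Illustrative names only; consult the Configuration reference for the
# actual required settings.
REQUIRED_VARS = ("DATABASE_URL", "REDIS_URL", "RECORDINGS_DIR", "PUBLIC_BASE_URL")

def missing_vars(env: dict) -> list:
    """Return the required settings that are absent or empty in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

# Typical startup check: missing_vars(dict(os.environ)) should return [].
```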
## Deployment

### Docker Compose

```bash
docker compose up --build
```

This starts PostgreSQL, Redis, and the backend. Run migrations before first use:

```bash
docker compose exec app alembic upgrade head
docker compose exec app python scripts/seed.py
```

### Manual Deployment
- Provision PostgreSQL and Redis
- Deploy the application container
- Set all environment variables (see Configuration)
- Run `alembic upgrade head`
- Optionally seed: `python scripts/seed.py`
- Ensure WebSocket traffic is routed correctly (no buffering, sticky sessions)
### Dashboard

Build the dashboard and let the backend serve it:

```bash
cd dashboard && npm install && npm run build
```

The built files in `dashboard/dist/` are mounted at `/` by the FastAPI app.
## Database Migrations

Migrations are managed by Alembic. The current migration chain:
| Version | Description |
|---|---|
| 001 | Initial schema (calls, agents, events) |
| 002 | Agent service fields (STT/LLM/TTS config) |
| 003 | Agent provider settings (JSON) |
| 004 | Pipeline settings (VAD, turn detection) |
| 005 | Tools and flow nodes |
| 006 | Telephony settings (multi-provider) |
| 007 | Telephony concurrency limits |
| 008 | Flow node tool_ids |
| 009 | Agent pre-call tool IDs |
| 010 | Vobiz telephony provider |
| 011 | Agent context variables |
| 012 | Extended event types |
Apply all migrations:

```bash
alembic upgrade head
```

Check the current version:

```bash
alembic current
```

## Logging
The backend uses structlog for structured logging. Key events:
| Event | When |
|---|---|
| `call_started` | Pipeline begins |
| `call_connected` | WebSocket connected |
| `call_disconnected` | WebSocket closed |
| `call_completed` | Call finalized |
| `user_spoke` | User transcription captured |
| `agent_spoke` | Agent response generated |
| `tool_called` | Tool function invoked |
| `recording_saved` | WAV file written |
| `cleanup_completed` | Scheduled cleanup finished |
| `node_entered` | Flow engine entered a new node |
| `node_transition` | Flow engine transitioned between nodes |
Set the log level via the `LOG_LEVEL` environment variable.
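As a rough illustration of what these events look like when rendered, here is a stdlib-only sketch that emits an event as a JSON line (the app itself uses structlog; this is not its actual logging configuration):

```python
import json
import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.INFO, format="%(message)s")
log = logging.getLogger("voice")

def log_event(event: str, **fields) -> str:
    """Render an event as a JSON line, structlog-style, and log it."""
    line = json.dumps({"event": event, **fields})
    log.info(line)
    return line

# e.g. log_event("call_started", call_id="abc123", agent_id=7)
```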
## Background Jobs

APScheduler starts automatically with the application:
| Job | Schedule | Description |
|---|---|---|
| Recording cleanup | Daily at 3 AM | Deletes recordings older than RECORDING_RETENTION_DAYS |
| Stale call cleanup | Every 5 minutes | Marks calls stuck in IN_PROGRESS for >30 min as completed |
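The stale-call rule above can be sketched in plain Python (the real job queries PostgreSQL on an APScheduler interval; the dict rows here are a simplified stand-in for database rows):

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(minutes=30)

def find_stale(calls: list, now: datetime) -> list:
    """Return IN_PROGRESS calls whose start time exceeds the stale threshold."""
    return [
        c for c in calls
        if c["status"] == "IN_PROGRESS" and now - c["started_at"] > STALE_AFTER
    ]

now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
calls = [
    {"id": 1, "status": "IN_PROGRESS", "started_at": now - timedelta(minutes=45)},
    {"id": 2, "status": "IN_PROGRESS", "started_at": now - timedelta(minutes=5)},
    {"id": 3, "status": "COMPLETED", "started_at": now - timedelta(hours=2)},
]
# Only call 1 exceeds the 30-minute threshold.
```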
## Concurrency

### WebSocket Sessions
Each active call holds one WebSocket connection and runs one Pipecat pipeline. Capacity depends on:
- CPU (pipeline processing)
- Network bandwidth (audio streaming)
- Provider API rate limits (STT/LLM/TTS)
- Telephony provider concurrency limits
### Per-Provider Limits

Each telephony provider has a configurable `max_concurrent` setting (default: 10). It is enforced at:
- WebSocket accept (returns 503 if exceeded)
- Outbound call API (returns error if exceeded)
Configure via `PUT /api/settings/telephony`.
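The counting behind `max_concurrent` can be sketched as follows (an illustration of the idea only, not the app's actual implementation; in the app the checks run at WebSocket accept and in the outbound call API):

```python
import threading

class ConcurrencyGuard:
    """Per-provider active-call counter (illustrative sketch)."""

    def __init__(self, max_concurrent: int = 10):
        self.max_concurrent = max_concurrent
        self.active = 0
        self._lock = threading.Lock()

    def try_acquire(self) -> bool:
        """Reserve a slot; False maps to a 503 / API error in the app."""
        with self._lock:
            if self.active >= self.max_concurrent:
                return False
            self.active += 1
            return True

    def release(self) -> None:
        """Free a slot when the call ends; never goes below zero."""
        with self._lock:
            self.active = max(0, self.active - 1)
```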
### Database Connections
The SQLAlchemy async pool defaults are suitable for moderate load. For higher concurrency:
- Use managed PostgreSQL or PgBouncer
- Monitor connection counts
- Tune the pool size in `database.py`
## Scaling

### Sticky Sessions
WebSocket calls are stateful. Production load balancers must pin WebSocket connections to a single instance. Configure session affinity or use a WebSocket-aware proxy.
### Recordings
Recordings are written to local disk by default. Multi-instance deployments need shared storage (NFS, S3-backed FUSE, etc.) if recordings must be accessible across all instances.
### Horizontal Scaling
Each call is independent — there is no cross-call shared state beyond the database. Adding instances scales call capacity linearly, provided:
- Sticky sessions are configured
- Recording storage is shared or per-instance
- Database connection pool is sized appropriately
## Recovery

### Server Restart
On startup, the app marks any IN_PROGRESS calls as FAILED. This prevents stale records from a previous crash.
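The recovery rule is simple to express (a sketch over an in-memory mapping; the app performs the equivalent UPDATE against PostgreSQL at startup):

```python
def recover_on_startup(statuses: dict) -> dict:
    """Mark calls left IN_PROGRESS by a previous run as FAILED.

    `statuses` maps call id -> status; it is a stand-in for the real
    database rows updated when the server boots.
    """
    return {
        call_id: ("FAILED" if status == "IN_PROGRESS" else status)
        for call_id, status in statuses.items()
    }
```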
### Stale Calls
The background job catches calls that somehow stay IN_PROGRESS for >30 minutes and marks them COMPLETED with an error message.
## Troubleshooting

### Calls fail immediately after backend starts
- Verify telephony provider credentials
- Verify AI service API keys (STT, LLM, TTS)
- Check WebSocket reachability from the provider
- Confirm the agent exists in the database
### No recordings appear
- Check the `RECORDINGS_DIR` path and filesystem permissions
- Verify the call reached the disconnect cleanup phase
- Check backend logs for recording errors
### Dashboard loads but shows empty data
- Verify `DASHBOARD_URL` in the CORS settings
- Check database connectivity
- Run `python scripts/seed.py` if no agents exist
### Migrations fail
If the database is in a partially migrated state:
```bash
# Check current state
alembic current

# If stuck, reset and rerun.
# WARNING: downgrading to base drops all tables and their data.
alembic downgrade base
alembic upgrade head
```

For a fresh database, simply run `alembic upgrade head`.