Configuration

All configuration is loaded from environment variables via config/settings.py (Pydantic Settings). Copy .env.example to .env and fill in the values you need.

Telephony credentials can also be stored in the database via PUT /api/settings/telephony. Database values take precedence over environment variables.

Contents

Minimum required
Application
Data stores
Authentication
Email (SMTP)
Telephony providers
AI services
Recordings
Dashboard
Per-agent configuration

Minimum Required

Backend startup only

env

DATABASE_URL=postgresql+asyncpg://postgres:postgres@localhost:5432/voiceagent
REDIS_URL=redis://localhost:6379

Live calls (add to above)

env

PUBLIC_BASE_URL=https://your-public-url.com

# Telephony (pick one)
TELEPHONY_PROVIDER=twilio
TWILIO_ACCOUNT_SID=ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TWILIO_AUTH_TOKEN=your_auth_token
TWILIO_PHONE_NUMBER=+15551234567

# STT (at least one)
DEEPGRAM_API_KEY=...

# LLM (at least one)
OPENAI_API_KEY=...

# TTS (at least one)
CARTESIA_API_KEY=...

Password reset (add to above)

env

JWT_SECRET_KEY=a-random-secret-at-least-32-chars
SMTP_HOST=smtp.gmail.com
SMTP_USER=you@yourdomain.com
SMTP_PASSWORD=your-app-password
SMTP_FROM_EMAIL=no-reply@yourdomain.com
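The three tiers above can be checked before startup with a small helper. This is a minimal sketch, not part of the project: the tier names and the `unmet_groups` function are hypothetical, and only the variable names come from this page. Variables that have defaults (such as `DATABASE_URL`) are reported as unset even though the backend may still start, and telephony credentials are left out because they can also live in the database.

```python
from typing import Mapping

# Each tier is a list of groups; a group is satisfied when at least one of
# its variables is set. Tier names and this helper are illustrative only.
TIERS: dict[str, list[list[str]]] = {
    "startup": [["DATABASE_URL"], ["REDIS_URL"]],
    "live_calls": [
        ["PUBLIC_BASE_URL"],
        # STT (at least one)
        ["DEEPGRAM_API_KEY", "SARVAM_API_KEY", "ELEVENLABS_API_KEY", "OPENAI_API_KEY"],
        # LLM (at least one)
        ["OPENAI_API_KEY", "GOOGLE_API_KEY", "XAI_API_KEY"],
        # TTS (at least one)
        ["CARTESIA_API_KEY", "ELEVENLABS_API_KEY", "SARVAM_API_KEY",
         "DEEPGRAM_API_KEY", "OPENAI_API_KEY"],
    ],
    "password_reset": [
        ["JWT_SECRET_KEY"], ["SMTP_HOST"], ["SMTP_USER"],
        ["SMTP_PASSWORD"], ["SMTP_FROM_EMAIL"],
    ],
}


def unmet_groups(tier: str, env: Mapping[str, str]) -> list[list[str]]:
    """Return the groups in `tier` for which no variable is set in `env`."""
    return [
        group for group in TIERS[tier]
        if not any(env.get(name) for name in group)
    ]
```

Calling `unmet_groups("live_calls", os.environ)` returns an empty list when every group has at least one variable set.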

Application

Variable	Default	Description
`APP_ENV`	`development`	`development` or `production` — affects error verbosity and logging
`APP_HOST`	`0.0.0.0`	Server bind address
`APP_PORT`	`8000`	Server port
`LOG_LEVEL`	`info`	`debug` · `info` · `warning` · `error`
`PUBLIC_BASE_URL`	`http://localhost:8000`	Publicly reachable HTTPS URL of this server, used to build telephony webhook URLs and WebSocket stream URLs. Required for live calls: telephony providers cannot reach the `localhost` default.
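To illustrate how `PUBLIC_BASE_URL` might be turned into webhook and stream URLs, here is a rough sketch. The function names and the example paths (`/api/webhooks/voice`, `/ws/media`) are assumptions made for illustration; the real routes are defined by the application and are not documented on this page.

```python
from urllib.parse import urlsplit, urlunsplit


def webhook_url(public_base_url: str, path: str) -> str:
    """Join the public base URL and a route path with exactly one slash."""
    return f"{public_base_url.rstrip('/')}/{path.lstrip('/')}"


def stream_url(public_base_url: str, path: str) -> str:
    """Same as webhook_url, but with the scheme swapped to ws/wss."""
    parts = urlsplit(webhook_url(public_base_url, path))
    ws_schemes = {"https": "wss", "http": "ws"}
    if parts.scheme not in ws_schemes:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    return urlunsplit(
        (ws_schemes[parts.scheme], parts.netloc, parts.path,
         parts.query, parts.fragment)
    )


# Hypothetical paths, for illustration only.
print(webhook_url("https://your-public-url.com", "/api/webhooks/voice"))
print(stream_url("https://your-public-url.com", "/ws/media"))
```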

Data Stores

Variable	Default	Description
`DATABASE_URL`	`postgresql+asyncpg://postgres:postgres@localhost:5432/voiceagent`	PostgreSQL async connection string
`REDIS_URL`	`redis://localhost:6379/0`	Redis connection string — used for pub/sub and active call state

Authentication

Variable	Default	Description
`JWT_SECRET_KEY`	`change-me-in-production-use-a-real-secret`	Secret used to sign JWT tokens. Change this in production. Use a random string of at least 32 characters.
`JWT_EXPIRY_HOURS`	`24`	How long JWT tokens remain valid (hours)

Generate a secure secret:

bash

python3 -c "import secrets; print(secrets.token_hex(32))"
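The guidance above (replace the default in production, use at least 32 characters) can be enforced with a startup check. The sketch below is an assumption about how such a check could look; `jwt_secret_problems` and `token_expiry` are hypothetical helpers, not functions from the codebase.

```python
import secrets
from datetime import datetime, timedelta, timezone

DEFAULT_JWT_SECRET = "change-me-in-production-use-a-real-secret"


def jwt_secret_problems(secret: str, app_env: str) -> list[str]:
    """Return human-readable problems with JWT_SECRET_KEY (empty if none)."""
    problems = []
    if app_env == "production" and secret == DEFAULT_JWT_SECRET:
        problems.append("JWT_SECRET_KEY is still the default value")
    if len(secret) < 32:
        problems.append("JWT_SECRET_KEY is shorter than 32 characters")
    return problems


def token_expiry(issued_at: datetime, expiry_hours: int = 24) -> datetime:
    """Expiry time for a token issued at `issued_at` (JWT_EXPIRY_HOURS)."""
    return issued_at + timedelta(hours=expiry_hours)


# token_hex(32) yields 64 hex characters, so it passes the length check.
fresh_secret = secrets.token_hex(32)
print(jwt_secret_problems(fresh_secret, "production"))
print(token_expiry(datetime.now(timezone.utc)))
```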

Email (SMTP)

Required for the forgot-password / password reset flow.

Variable	Default	Description
`SMTP_HOST`	—	SMTP server hostname (e.g. `smtp.gmail.com`)
`SMTP_PORT`	`587`	SMTP port (587 for STARTTLS, 465 for SSL)
`SMTP_USER`	—	SMTP username / login email
`SMTP_PASSWORD`	—	SMTP password or app-specific password
`SMTP_FROM_EMAIL`	—	Sender address shown on reset emails
`SMTP_FROM_NAME`	`Dariet`	Sender display name
`SMTP_USE_TLS`	`true`	Enable STARTTLS (`true`) or plain (`false`)

Gmail example:

env

SMTP_HOST=smtp.gmail.com
SMTP_PORT=587
SMTP_USER=you@gmail.com
SMTP_PASSWORD=xxxx-xxxx-xxxx-xxxx   # App password, not account password
SMTP_FROM_EMAIL=you@gmail.com
SMTP_USE_TLS=true

If SMTP is not configured, the forgot-password endpoint still returns 200, but no email is sent.
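The following sketch shows one way the SMTP variables could be used with Python's standard `smtplib`. It is not the project's mailer: the function names, the subject line, and the message text are made up, and treating port 465 as implicit SSL regardless of `SMTP_USE_TLS` is an assumption.

```python
import smtplib
from email.message import EmailMessage
from email.utils import formataddr
from typing import Mapping, Optional


def build_reset_email(env: Mapping[str, str], to_addr: str,
                      reset_link: str) -> Optional[EmailMessage]:
    """Build the reset email, or return None when SMTP is not configured."""
    if not env.get("SMTP_HOST"):
        return None  # mirrors the "no email is sent" behaviour described above
    msg = EmailMessage()
    sender_name = env.get("SMTP_FROM_NAME", "Dariet")
    msg["From"] = formataddr((sender_name, env["SMTP_FROM_EMAIL"]))
    msg["To"] = to_addr
    msg["Subject"] = "Reset your password"  # hypothetical subject
    msg.set_content(f"Use this link to reset your password: {reset_link}")
    return msg


def send_email(env: Mapping[str, str], msg: EmailMessage) -> None:
    """Send `msg` using STARTTLS (587) or implicit SSL (465)."""
    host = env["SMTP_HOST"]
    port = int(env.get("SMTP_PORT", "587"))
    use_starttls = env.get("SMTP_USE_TLS", "true").lower() == "true"
    if port == 465:
        server = smtplib.SMTP_SSL(host, port, timeout=10)
    else:
        server = smtplib.SMTP(host, port, timeout=10)
    with server:
        if port != 465 and use_starttls:
            server.starttls()
        server.login(env["SMTP_USER"], env["SMTP_PASSWORD"])
        server.send_message(msg)
```

A caller would build the message first and call `send_email` only when the result is not `None`.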

Telephony Providers

Global

Variable	Default	Description
`TELEPHONY_PROVIDER`	`twilio`	Default provider when not specified per-call: `twilio` · `exotel` · `vobiz`

Twilio

Variable	Description
`TWILIO_ACCOUNT_SID`	Account SID from the Twilio console (starts with `AC`)
`TWILIO_AUTH_TOKEN`	Auth token from the Twilio console
`TWILIO_PHONE_NUMBER`	Default caller ID for outbound calls (E.164)

Exotel

Variable	Default	Description
`EXOTEL_ACCOUNT_SID`	—	Account SID
`EXOTEL_API_KEY`	—	API key
`EXOTEL_API_TOKEN`	—	API token
`EXOTEL_PHONE_NUMBER`	—	Default virtual number (E.164)
`EXOTEL_API_BASE`	`api.in.exotel.com`	API cluster — `api.in.exotel.com` (Mumbai) or `api.exotel.com` (Singapore)

Vobiz

Variable	Description
`VOBIZ_AUTH_ID`	Auth ID
`VOBIZ_AUTH_TOKEN`	Auth token
`VOBIZ_PHONE_NUMBER`	Default phone number (E.164)
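As noted at the top of this page, telephony credentials stored in the database take precedence over environment variables. The sketch below illustrates that precedence together with a check of the variables each provider needs. It assumes, purely for illustration, that database values are keyed by the same names as the environment variables; `resolve_telephony` is a hypothetical helper.

```python
import re
from typing import Mapping

REQUIRED = {
    "twilio": ["TWILIO_ACCOUNT_SID", "TWILIO_AUTH_TOKEN", "TWILIO_PHONE_NUMBER"],
    "exotel": ["EXOTEL_ACCOUNT_SID", "EXOTEL_API_KEY", "EXOTEL_API_TOKEN",
               "EXOTEL_PHONE_NUMBER"],
    "vobiz": ["VOBIZ_AUTH_ID", "VOBIZ_AUTH_TOKEN", "VOBIZ_PHONE_NUMBER"],
}

# E.164: a plus sign, then up to 15 digits, the first of which is not zero.
E164 = re.compile(r"^\+[1-9]\d{1,14}$")


def resolve_telephony(env: Mapping[str, str],
                      db_values: Mapping[str, str]) -> dict[str, str]:
    """Merge env and database values (database wins) and validate them."""
    merged = {**env, **{k: v for k, v in db_values.items() if v}}
    provider = merged.get("TELEPHONY_PROVIDER", "twilio")
    if provider not in REQUIRED:
        raise ValueError(f"unknown telephony provider: {provider!r}")
    creds = {name: merged.get(name, "") for name in REQUIRED[provider]}
    missing = [name for name, value in creds.items() if not value]
    if missing:
        raise ValueError(f"missing {provider} settings: {', '.join(missing)}")
    number = creds[f"{provider.upper()}_PHONE_NUMBER"]
    if not E164.match(number):
        raise ValueError(f"phone number is not E.164: {number!r}")
    return {"provider": provider, **creds}
```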

AI Services

Some API keys cover more than one service (for example, OPENAI_API_KEY covers STT, LLM, and TTS via OpenAI). Set keys only for the providers you intend to use.

Speech-to-Text (STT)

Variable	Covers
`DEEPGRAM_API_KEY`	Deepgram STT (Nova models)
`SARVAM_API_KEY`	Sarvam STT + TTS (Saaras, Bulbul)
`ELEVENLABS_API_KEY`	ElevenLabs STT + TTS
`OPENAI_API_KEY`	OpenAI Whisper STT + LLM + TTS

Large Language Model (LLM)

Variable	Covers
`OPENAI_API_KEY`	OpenAI (GPT-4.1 family)
`GOOGLE_API_KEY`	Google Gemini
`XAI_API_KEY`	xAI Grok

Text-to-Speech (TTS)

Variable	Covers
`CARTESIA_API_KEY`	Cartesia Sonic
`ELEVENLABS_API_KEY`	ElevenLabs Flash / Turbo
`SARVAM_API_KEY`	Sarvam Bulbul
`DEEPGRAM_API_KEY`	Deepgram Aura
`OPENAI_API_KEY`	OpenAI TTS
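The three tables above can be read as one mapping from API key to the services it unlocks. The sketch below encodes that mapping and reports which providers are usable for a given environment. The provider identifiers match the per-agent `stt_provider`, `llm_provider`, and `tts_provider` values listed later on this page; `available_providers` itself is a hypothetical helper.

```python
from typing import Mapping

# API key -> {service: provider id}, transcribed from the tables above.
KEY_COVERAGE: dict[str, dict[str, str]] = {
    "DEEPGRAM_API_KEY": {"stt": "deepgram", "tts": "deepgram"},
    "SARVAM_API_KEY": {"stt": "sarvam", "tts": "sarvam"},
    "ELEVENLABS_API_KEY": {"stt": "elevenlabs", "tts": "elevenlabs"},
    "OPENAI_API_KEY": {"stt": "openai", "llm": "openai", "tts": "openai"},
    "GOOGLE_API_KEY": {"llm": "google"},
    "XAI_API_KEY": {"llm": "grok"},
    "CARTESIA_API_KEY": {"tts": "cartesia"},
}


def available_providers(env: Mapping[str, str]) -> dict[str, list[str]]:
    """List the providers usable per service, based on which keys are set."""
    result: dict[str, list[str]] = {"stt": [], "llm": [], "tts": []}
    for key, services in KEY_COVERAGE.items():
        if env.get(key):
            for service, provider in services.items():
                result[service].append(provider)
    return result
```

A live call needs at least one entry in each of the three lists.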

Recordings

Variable	Default	Description
`RECORDINGS_DIR`	`./recordings`	Directory where WAV files are stored
`RECORDING_RETENTION_DAYS`	`30`	Days to keep recordings before auto-deletion. Set to `0` to disable deletion.
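The retention rule can be pictured as a periodic sweep over `RECORDINGS_DIR`. The sketch below is an assumption about how such a sweep might work, not the project's implementation: it judges age by file modification time, looks only at `*.wav` files directly inside the directory, and treats `0` as "never delete".

```python
import time
from pathlib import Path
from typing import Optional


def purge_old_recordings(recordings_dir: str, retention_days: int,
                         now: Optional[float] = None) -> list[Path]:
    """Delete WAV files older than `retention_days`; return what was removed."""
    if retention_days <= 0:
        return []  # RECORDING_RETENTION_DAYS=0 disables deletion
    current = time.time() if now is None else now
    cutoff = current - retention_days * 86400  # seconds per day
    removed = []
    for wav in sorted(Path(recordings_dir).glob("*.wav")):
        if wav.stat().st_mtime < cutoff:
            wav.unlink()
            removed.append(wav)
    return removed
```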

Dashboard

Variable	Default	Description
`DASHBOARD_URL`	`http://localhost:5173`	Dashboard origin — added to CORS allow-list
`VITE_API_URL`	`http://localhost:8000/api`	API base URL used by the Vite dev server (set in `dashboard/.env`)

Per-Agent Configuration

Each agent stores its own AI provider config in the database. These are set when creating or updating an agent via the API or dashboard.

STT options

Provider	`stt_provider`	Models (`stt_model`)	Notes
Deepgram	`deepgram`	`nova-3-general`, `nova-2-general`, `nova-2-phonecall`	Default. Best for English phone calls.
Sarvam	`sarvam`	`saaras:v3`, `saarika:v2.5`	Indian languages. Set `stt_mode` to `transcribe` or `translate`.
OpenAI	`openai`	`whisper-1`	General purpose. Higher latency.
ElevenLabs	`elevenlabs`	`scribe_v1`	Multilingual.

stt_language: BCP-47 language tag — en, hi-IN, ta-IN, te-IN, kn-IN, mr-IN, etc.

stt_mode (Sarvam only):

transcribe — returns text in the spoken language
translate — returns text translated to English
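The STT rules above (which models belong to which provider, and `stt_mode` applying to Sarvam only) can be expressed as a small validator. This is a sketch under the assumption that an agent's STT config arrives as a flat dictionary with the field names used on this page; `validate_stt` is hypothetical and the API may validate differently.

```python
# Provider id -> accepted models, transcribed from the STT options table.
STT_MODELS: dict[str, set[str]] = {
    "deepgram": {"nova-3-general", "nova-2-general", "nova-2-phonecall"},
    "sarvam": {"saaras:v3", "saarika:v2.5"},
    "openai": {"whisper-1"},
    "elevenlabs": {"scribe_v1"},
}


def validate_stt(config: dict) -> list[str]:
    """Return a list of problems with the STT fields (empty if valid)."""
    provider = config.get("stt_provider", "deepgram")  # Deepgram is the default
    if provider not in STT_MODELS:
        return [f"unknown stt_provider: {provider!r}"]
    errors = []
    model = config.get("stt_model")
    if model is not None and model not in STT_MODELS[provider]:
        errors.append(f"{model!r} is not a {provider} model")
    mode = config.get("stt_mode")
    if mode is not None:
        if provider != "sarvam":
            errors.append("stt_mode applies to Sarvam only")
        elif mode not in ("transcribe", "translate"):
            errors.append(f"unknown stt_mode: {mode!r}")
    return errors
```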

LLM options

Provider	`llm_provider`	Models (`llm_model`)	Notes
OpenAI	`openai`	`gpt-4.1-nano`, `gpt-4.1-mini`, `gpt-4.1`, `gpt-4o-mini`, `gpt-4o`	Default. `gpt-4.1-nano` is fastest and cheapest.
Google	`google`	`gemini-2.5-flash`, `gemini-2.0-flash`, `gemini-1.5-flash`	Good multilingual support.
xAI	`grok`	`grok-3-beta`, `grok-3-mini-beta`

llm_settings:

json

{
  "temperature": 0.1,
  "max_tokens": 150
}

Keep max_tokens low (100–200) for voice — long responses feel unnatural on a call.

TTS options

Provider	`tts_provider`	Models (`tts_model`)	Notes
Cartesia	`cartesia`	`sonic-3`, `sonic-2`, `sonic-turbo`	Default. Lowest latency.
ElevenLabs	`elevenlabs`	`eleven_flash_v2_5`, `eleven_turbo_v2_5`	Natural voices, multilingual.
Sarvam	`sarvam`	`bulbul:v2`	Indian languages.
Deepgram	`deepgram`	`aura-2-en-us`, `aura-asteria-en`	English only.
OpenAI	`openai`	`tts-1`, `tts-1-hd`	Standard voices.

voice: Voice ID format varies by provider:

Cartesia: UUID, e.g. 694f9389-aac1-45b6-b726-9d9369183238
ElevenLabs: voice ID or name, e.g. 21m00Tcm4TlvDq8ikWAM
Sarvam: speaker name, e.g. meera, arjun, amol
Deepgram: voice name, e.g. asteria, luna, stella
OpenAI: voice name, e.g. alloy, echo, fable, onyx, nova, shimmer

tts_settings examples:

json

// Cartesia — speed and emotion
{ "speed": 1.05, "emotion": ["positivity:high", "curiosity:medium"] }

// ElevenLabs — stability and similarity boost
{ "stability": 0.5, "similarity_boost": 0.75 }
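To show how the STT, LLM, and TTS fields fit together, the sketch below assembles one agent configuration and serialises it to JSON. The field names and values are taken from this page, but the flat top-level shape of the payload is an assumption; check the API reference for the actual request body before relying on it.

```python
import json


def example_agent_config() -> dict:
    """Assemble a per-agent AI config from values listed on this page."""
    return {
        # Speech-to-text
        "stt_provider": "deepgram",
        "stt_model": "nova-3-general",
        "stt_language": "en",
        # Language model
        "llm_provider": "openai",
        "llm_model": "gpt-4.1-nano",
        "llm_settings": {"temperature": 0.1, "max_tokens": 150},
        # Text-to-speech
        "tts_provider": "cartesia",
        "tts_model": "sonic-3",
        "voice": "694f9389-aac1-45b6-b726-9d9369183238",
        "tts_settings": {
            "speed": 1.05,
            "emotion": ["positivity:high", "curiosity:medium"],
        },
    }


# Standard JSON has no comment syntax, so the payload carries values only.
print(json.dumps(example_agent_config(), indent=2))
```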

Pipeline settings

Controls voice activity detection (VAD), turn detection, and idle timeouts.

Field	Default	Description
`idle_warn_secs`	`25`	Seconds of silence before the agent says something to re-engage the caller
`idle_end_secs`	`50`	Seconds of silence before the agent ends the call
`vad_confidence`	`0.5`	Silero VAD confidence threshold (0–1). Lower = more sensitive to speech.
`vad_start_secs`	`0.2`	Seconds of detected speech before transcription starts
`vad_stop_secs`	`0.2`	Seconds of silence after speech before the turn is submitted
`vad_min_volume`	`0.4`	Minimum RMS volume to count as speech (0–1). Filters background noise.
`min_words`	`3`	Minimum word count before the turn is sent to the LLM. Filters filler words.

json

{
  "pipeline_settings": {
    "idle_warn_secs": 20,
    "idle_end_secs": 40,
    "vad_confidence": 0.4,
    "vad_start_secs": 0.15,
    "vad_stop_secs": 0.3,
    "vad_min_volume": 0.3,
    "min_words": 2
  }
}

For noisy environments (call centres, outdoor), raise vad_min_volume and vad_confidence. For soft-spoken callers, lower them.
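The pipeline fields can be modelled as a settings object plus a few gating functions. The sketch below uses the defaults from the table above. The helper names and the exact comparison rules (for example, treating a value equal to the threshold as passing) are assumptions, since this page does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineSettings:
    idle_warn_secs: float = 25
    idle_end_secs: float = 50
    vad_confidence: float = 0.5
    vad_start_secs: float = 0.2
    vad_stop_secs: float = 0.2
    vad_min_volume: float = 0.4
    min_words: int = 3

    def __post_init__(self) -> None:
        # Both VAD thresholds are documented as values in the range 0-1.
        for name in ("vad_confidence", "vad_min_volume"):
            value = getattr(self, name)
            if not 0 <= value <= 1:
                raise ValueError(f"{name} must be between 0 and 1, got {value}")


def is_speech(s: PipelineSettings, confidence: float, volume: float) -> bool:
    """A frame counts as speech only if it clears both VAD thresholds."""
    return confidence >= s.vad_confidence and volume >= s.vad_min_volume


def should_send_to_llm(s: PipelineSettings, transcript: str) -> bool:
    """Drop turns with fewer than `min_words` words."""
    return len(transcript.split()) >= s.min_words


def idle_action(s: PipelineSettings, silence_secs: float) -> str:
    """Return "end", "warn", or "none" for the current length of silence."""
    if silence_secs >= s.idle_end_secs:
        return "end"
    if silence_secs >= s.idle_warn_secs:
        return "warn"
    return "none"
```

For a noisy line, `PipelineSettings(vad_confidence=0.7, vad_min_volume=0.6)` would reject frames that the defaults accept.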
