
Configuration

All configuration is loaded from environment variables via config/settings.py (Pydantic Settings). Copy .env.example to .env and fill in the values you need.

Telephony credentials can also be stored in the database via PUT /api/settings/telephony. Database values take precedence over environment variables.
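The precedence rule above can be pictured with a small lookup helper. This is only an illustrative sketch: the function name `resolve_setting` and the `db_values` mapping are hypothetical, and treating an empty database value as "unset" is an assumption, not documented behaviour.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    name: str,
    db_values: Mapping[str, str],
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the database value when present, else the environment value."""
    db_value = db_values.get(name)
    if db_value:  # assumption: a missing or empty DB value falls back to the environment
        return db_value
    return env.get(name)


if __name__ == "__main__":
    stored = {"TWILIO_AUTH_TOKEN": "token-from-database"}
    environment = {"TWILIO_AUTH_TOKEN": "token-from-env", "TWILIO_PHONE_NUMBER": "+15551234567"}
    print(resolve_setting("TWILIO_AUTH_TOKEN", stored, environment))    # database wins
    print(resolve_setting("TWILIO_PHONE_NUMBER", stored, environment))  # falls back to env
```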


Minimum Required

Backend startup only

env
DATABASE_URL=postgresql+asyncpg://postgres:postgres@localhost:5432/voiceagent
REDIS_URL=redis://localhost:6379

Live calls (add to above)

env
PUBLIC_BASE_URL=https://your-public-url.com

# Telephony (pick one)
TELEPHONY_PROVIDER=twilio
TWILIO_ACCOUNT_SID=ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TWILIO_AUTH_TOKEN=your_auth_token
TWILIO_PHONE_NUMBER=+15551234567

# STT (at least one)
DEEPGRAM_API_KEY=...

# LLM (at least one)
OPENAI_API_KEY=...

# TTS (at least one)
CARTESIA_API_KEY=...

Password reset (add to above)

env
JWT_SECRET_KEY=a-random-secret-at-least-32-chars
SMTP_HOST=smtp.gmail.com
SMTP_USER=you@yourdomain.com
SMTP_PASSWORD=your-app-password
SMTP_FROM_EMAIL=no-reply@yourdomain.com

Application

| Variable | Default | Description |
|---|---|---|
| APP_ENV | development | development or production; affects error verbosity and logging |
| APP_HOST | 0.0.0.0 | Server bind address |
| APP_PORT | 8000 | Server port |
| LOG_LEVEL | info | debug · info · warning · error |
| PUBLIC_BASE_URL | http://localhost:8000 | Publicly accessible HTTPS URL, used to build telephony webhook URLs and WebSocket stream URLs. Required for any telephony provider to work. |
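To illustrate how webhook and WebSocket stream URLs can be derived from PUBLIC_BASE_URL, here is a minimal sketch. The two paths are placeholders invented for the example, not the application's real routes, and `build_call_urls` is a hypothetical helper.

```python
from typing import Tuple
from urllib.parse import urlsplit, urlunsplit


def build_call_urls(
    public_base_url: str,
    webhook_path: str = "/hypothetical/webhook",  # placeholder path, not the real route
    stream_path: str = "/hypothetical/stream",    # placeholder path, not the real route
) -> Tuple[str, str]:
    """Return (webhook_url, stream_url) derived from the public base URL."""
    parts = urlsplit(public_base_url.rstrip("/"))
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"PUBLIC_BASE_URL must be an absolute http(s) URL: {public_base_url!r}")
    # WebSocket URLs mirror the HTTP scheme: https -> wss, http -> ws.
    ws_scheme = "wss" if parts.scheme == "https" else "ws"
    webhook_url = urlunsplit((parts.scheme, parts.netloc, parts.path + webhook_path, "", ""))
    stream_url = urlunsplit((ws_scheme, parts.netloc, parts.path + stream_path, "", ""))
    return webhook_url, stream_url


if __name__ == "__main__":
    print(build_call_urls("https://your-public-url.com"))
```

Telephony providers call back over the public internet, which is why the localhost default is only suitable for starting the backend, not for live calls.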

Data Stores

| Variable | Default | Description |
|---|---|---|
| DATABASE_URL | postgresql+asyncpg://postgres:postgres@localhost:5432/voiceagent | PostgreSQL async connection string |
| REDIS_URL | redis://localhost:6379/0 | Redis connection string, used for pub/sub and active call state |

Authentication

| Variable | Default | Description |
|---|---|---|
| JWT_SECRET_KEY | change-me-in-production-use-a-real-secret | Secret used to sign JWTs. Change this in production; use a random string of at least 32 characters. |
| JWT_EXPIRY_HOURS | 24 | How long JWTs remain valid (hours) |

Generate a secure secret:

bash
python3 -c "import secrets; print(secrets.token_hex(32))"
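A startup check along the following lines can catch a weak or unchanged secret before deployment. It is a sketch only: `check_jwt_secret` is a hypothetical helper, and nothing here is taken from the project's own validation code.

```python
import secrets
from typing import List

# Default value documented in the Authentication table.
DEFAULT_JWT_SECRET = "change-me-in-production-use-a-real-secret"


def check_jwt_secret(secret: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means the secret looks fine."""
    problems = []
    if secret == DEFAULT_JWT_SECRET:
        problems.append("JWT_SECRET_KEY is still the documented default")
    if len(secret) < 32:
        problems.append("JWT_SECRET_KEY is shorter than 32 characters")
    return problems


if __name__ == "__main__":
    print(check_jwt_secret(DEFAULT_JWT_SECRET))
    print(check_jwt_secret(secrets.token_hex(32)))  # 64 hex characters, no problems expected
```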

Email (SMTP)

Required for the forgot-password / password reset flow.

| Variable | Default | Description |
|---|---|---|
| SMTP_HOST | | SMTP server hostname (e.g. smtp.gmail.com) |
| SMTP_PORT | 587 | SMTP port (587 for STARTTLS, 465 for SSL) |
| SMTP_USER | | SMTP username / login email |
| SMTP_PASSWORD | | SMTP password or app-specific password |
| SMTP_FROM_EMAIL | | Sender address shown on reset emails |
| SMTP_FROM_NAME | Dariet | Sender display name |
| SMTP_USE_TLS | true | Enable STARTTLS (true) or plain (false) |

Gmail example:

env
SMTP_HOST=smtp.gmail.com
SMTP_PORT=587
SMTP_USER=you@gmail.com
SMTP_PASSWORD=xxxx-xxxx-xxxx-xxxx   # App password, not account password
SMTP_FROM_EMAIL=you@gmail.com
SMTP_USE_TLS=true

If SMTP is not configured, the forgot-password endpoint returns 200 but no email is sent.
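To show how the SMTP variables fit together, the sketch below builds a reset email and sends it over STARTTLS using only the standard library. The function names, subject line, and reset link are hypothetical; the project's actual mailer may differ.

```python
import smtplib
from email.message import EmailMessage
from email.utils import formataddr


def build_reset_email(to_addr: str, reset_link: str, from_email: str,
                      from_name: str = "Dariet") -> EmailMessage:
    """Compose a plain-text password reset message."""
    msg = EmailMessage()
    msg["Subject"] = "Reset your password"  # illustrative subject line
    msg["From"] = formataddr((from_name, from_email))  # SMTP_FROM_NAME + SMTP_FROM_EMAIL
    msg["To"] = to_addr
    msg.set_content(f"Use this link to reset your password:\n{reset_link}\n")
    return msg


def send_email(msg: EmailMessage, host: str, port: int = 587,
               user: str = "", password: str = "", use_tls: bool = True) -> None:
    """Send via SMTP, upgrading to STARTTLS when use_tls is true (SMTP_USE_TLS)."""
    with smtplib.SMTP(host, port, timeout=10) as server:
        if use_tls:
            server.starttls()
        if user:
            server.login(user, password)
        server.send_message(msg)


if __name__ == "__main__":
    message = build_reset_email(
        "user@example.com",
        "https://your-public-url.com/reset?token=example",  # hypothetical link format
        "no-reply@yourdomain.com",
    )
    print(message)
```

For implicit SSL on port 465, `smtplib.SMTP_SSL` would replace `smtplib.SMTP` and the `starttls()` call would be dropped.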


Telephony Providers

Global

| Variable | Default | Description |
|---|---|---|
| TELEPHONY_PROVIDER | twilio | Default provider when not specified per call: twilio · exotel · vobiz |

Twilio

| Variable | Description |
|---|---|
| TWILIO_ACCOUNT_SID | Account SID from the Twilio console (starts with AC) |
| TWILIO_AUTH_TOKEN | Auth token from the Twilio console |
| TWILIO_PHONE_NUMBER | Default caller ID for outbound calls (E.164) |

Exotel

| Variable | Default | Description |
|---|---|---|
| EXOTEL_ACCOUNT_SID | | Account SID |
| EXOTEL_API_KEY | | API key |
| EXOTEL_API_TOKEN | | API token |
| EXOTEL_PHONE_NUMBER | | Default virtual number (E.164) |
| EXOTEL_API_BASE | api.in.exotel.com | API cluster: api.in.exotel.com (Mumbai) or api.exotel.com (Singapore) |

Vobiz

| Variable | Description |
|---|---|
| VOBIZ_AUTH_ID | Auth ID |
| VOBIZ_AUTH_TOKEN | Auth token |
| VOBIZ_PHONE_NUMBER | Default phone number (E.164) |
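The phone number variables for all three providers expect E.164 format: a leading +, a country code that does not start with 0, and at most 15 digits in total. A quick sanity check might look like the following; `is_e164` is a hypothetical helper, and the pattern checks only the shape of the number, not whether it is actually assigned.

```python
import re

# "+" followed by 2 to 15 digits, the first of which is non-zero.
E164_PATTERN = re.compile(r"^\+[1-9]\d{1,14}$")


def is_e164(number: str) -> bool:
    """Return True if the string has the shape of an E.164 phone number."""
    return bool(E164_PATTERN.match(number))


if __name__ == "__main__":
    for candidate in ("+15551234567", "5551234567", "+1 555 123 4567"):
        print(candidate, is_e164(candidate))
```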

AI Services

A single key often covers several services from the same provider (e.g. OPENAI_API_KEY covers STT, LLM, and TTS via OpenAI). Only set keys for the providers you intend to use.

Speech-to-Text (STT)

| Variable | Covers |
|---|---|
| DEEPGRAM_API_KEY | Deepgram STT (Nova models) |
| SARVAM_API_KEY | Sarvam STT + TTS (Saaras, Bulbul) |
| ELEVENLABS_API_KEY | ElevenLabs STT + TTS |
| OPENAI_API_KEY | OpenAI Whisper STT + LLM + TTS |

Large Language Model (LLM)

| Variable | Covers |
|---|---|
| OPENAI_API_KEY | OpenAI (GPT-4.1 family) |
| GOOGLE_API_KEY | Google Gemini |
| XAI_API_KEY | xAI Grok |

Text-to-Speech (TTS)

| Variable | Covers |
|---|---|
| CARTESIA_API_KEY | Cartesia Sonic |
| ELEVENLABS_API_KEY | ElevenLabs Flash / Turbo |
| SARVAM_API_KEY | Sarvam Bulbul |
| DEEPGRAM_API_KEY | Deepgram Aura |
| OPENAI_API_KEY | OpenAI TTS |
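Because one key can serve several roles, it can help to check which keys a given STT/LLM/TTS combination actually needs. The sketch below encodes the key tables above as a mapping from the per-agent provider identifiers to environment variable names; `missing_api_keys` is a hypothetical helper, not part of the codebase.

```python
from typing import Iterable, List, Mapping

# Provider identifier (as used in per-agent config) -> environment variable.
PROVIDER_KEYS = {
    "deepgram": "DEEPGRAM_API_KEY",
    "sarvam": "SARVAM_API_KEY",
    "elevenlabs": "ELEVENLABS_API_KEY",
    "openai": "OPENAI_API_KEY",
    "google": "GOOGLE_API_KEY",
    "grok": "XAI_API_KEY",
    "cartesia": "CARTESIA_API_KEY",
}


def missing_api_keys(providers: Iterable[str], env: Mapping[str, str]) -> List[str]:
    """Return the sorted, de-duplicated key names that are unset or empty in env."""
    needed = {PROVIDER_KEYS[p] for p in providers}
    return sorted(key for key in needed if not env.get(key))


if __name__ == "__main__":
    env = {"DEEPGRAM_API_KEY": "set", "OPENAI_API_KEY": "set"}
    # STT = deepgram, LLM = openai, TTS = cartesia
    print(missing_api_keys(["deepgram", "openai", "cartesia"], env))
```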

Recordings

| Variable | Default | Description |
|---|---|---|
| RECORDINGS_DIR | ./recordings | Directory where WAV files are stored |
| RECORDING_RETENTION_DAYS | 30 | Days to keep recordings before auto-deletion. Set to 0 to disable deletion. |
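The retention rule can be sketched as a small cleanup function. This is an illustration under assumptions: it measures age by file modification time and only looks at .wav files directly inside the directory, neither of which is confirmed by the documentation, and `purge_old_recordings` is a hypothetical name.

```python
import time
from pathlib import Path
from typing import List, Optional

SECONDS_PER_DAY = 86400


def purge_old_recordings(recordings_dir: str, retention_days: int,
                         now: Optional[float] = None) -> List[Path]:
    """Delete .wav files older than retention_days and return the deleted paths."""
    if retention_days <= 0:  # 0 disables deletion
        return []
    current = time.time() if now is None else now
    cutoff = current - retention_days * SECONDS_PER_DAY
    deleted = []
    for wav in Path(recordings_dir).glob("*.wav"):
        # Assumption: age is taken from the file's modification time.
        if wav.stat().st_mtime < cutoff:
            wav.unlink()
            deleted.append(wav)
    return deleted


if __name__ == "__main__":
    print(purge_old_recordings("./recordings", 30))
```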

Dashboard

| Variable | Default | Description |
|---|---|---|
| DASHBOARD_URL | http://localhost:5173 | Dashboard origin, added to the CORS allow-list |
| VITE_API_URL | http://localhost:8000/api | API base URL used by the Vite dev server (set in dashboard/.env) |

Per-Agent Configuration

Each agent stores its own AI provider config in the database. These are set when creating or updating an agent via the API or dashboard.

STT options

| Provider | stt_provider | Models (stt_model) | Notes |
|---|---|---|---|
| Deepgram | deepgram | nova-3-general, nova-2-general, nova-2-phonecall | Default. Best for English phone calls. |
| Sarvam | sarvam | saaras:v3, saarika:v2.5 | Indian languages. Set stt_mode to transcribe or translate. |
| OpenAI | openai | whisper-1 | General purpose. Higher latency. |
| ElevenLabs | elevenlabs | scribe_v1 | Multilingual. |

stt_language: BCP-47 language tag — en, hi-IN, ta-IN, te-IN, kn-IN, mr-IN, etc.

stt_mode (Sarvam only):

  • transcribe — returns text in the spoken language
  • translate — returns text translated to English

LLM options

| Provider | llm_provider | Models (llm_model) | Notes |
|---|---|---|---|
| OpenAI | openai | gpt-4.1-nano, gpt-4.1-mini, gpt-4.1, gpt-4o-mini, gpt-4o | Default. gpt-4.1-nano is fastest and cheapest. |
| Google | google | gemini-2.5-flash, gemini-2.0-flash, gemini-1.5-flash | Good multilingual support. |
| xAI | grok | grok-3-beta, grok-3-mini-beta | |

llm_settings:

json
{
  "temperature": 0.1,
  "max_tokens": 150
}

Keep max_tokens low (100–200) for voice — long responses feel unnatural on a call.

TTS options

| Provider | tts_provider | Models (tts_model) | Notes |
|---|---|---|---|
| Cartesia | cartesia | sonic-3, sonic-2, sonic-turbo | Default. Lowest latency. |
| ElevenLabs | elevenlabs | eleven_flash_v2_5, eleven_turbo_v2_5 | Natural voices, multilingual. |
| Sarvam | sarvam | bulbul:v2 | Indian languages. |
| Deepgram | deepgram | aura-2-en-us, aura-asteria-en | English only. |
| OpenAI | openai | tts-1, tts-1-hd | Standard voices. |

voice: the voice ID format varies by provider:

  • Cartesia: UUID, e.g. 694f9389-aac1-45b6-b726-9d9369183238
  • ElevenLabs: voice ID or name, e.g. 21m00Tcm4TlvDq8ikWAM
  • Sarvam: speaker name, e.g. meera, arjun, amol
  • Deepgram: voice name, e.g. asteria, luna, stella
  • OpenAI: voice name, e.g. alloy, echo, fable, onyx, nova, shimmer

tts_settings examples:

Cartesia (speed and emotion):

json
{ "speed": 1.05, "emotion": ["positivity:high", "curiosity:medium"] }

ElevenLabs (stability and similarity boost):

json
{ "stability": 0.5, "similarity_boost": 0.75 }

Pipeline settings

Controls voice activity detection (VAD), turn detection, and idle timeouts.

| Field | Default | Description |
|---|---|---|
| idle_warn_secs | 25 | Seconds of silence before the agent says something to re-engage the caller |
| idle_end_secs | 50 | Seconds of silence before the agent ends the call |
| vad_confidence | 0.5 | Silero VAD confidence threshold (0–1). Lower = more sensitive to speech. |
| vad_start_secs | 0.2 | Seconds of detected speech before transcription starts |
| vad_stop_secs | 0.2 | Seconds of silence after speech before the turn is submitted |
| vad_min_volume | 0.4 | Minimum RMS volume to count as speech (0–1). Filters background noise. |
| min_words | 3 | Minimum word count before the turn is sent to the LLM. Filters filler words. |

json
{
  "pipeline_settings": {
    "idle_warn_secs": 20,
    "idle_end_secs": 40,
    "vad_confidence": 0.4,
    "vad_start_secs": 0.15,
    "vad_stop_secs": 0.3,
    "vad_min_volume": 0.3,
    "min_words": 2
  }
}

For noisy environments (call centres, outdoor), raise vad_min_volume and vad_confidence. For soft-spoken callers, lower them.
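To make the meaning of these fields concrete, the sketch below models them as a dataclass with the documented defaults and shows how each threshold could be applied. It illustrates the field semantics only; the function names are hypothetical and this is not how the pipeline or Silero VAD is actually implemented.

```python
from dataclasses import dataclass


@dataclass
class PipelineSettings:
    # Defaults copied from the pipeline settings table.
    idle_warn_secs: float = 25
    idle_end_secs: float = 50
    vad_confidence: float = 0.5
    vad_start_secs: float = 0.2
    vad_stop_secs: float = 0.2
    vad_min_volume: float = 0.4
    min_words: int = 3


def idle_action(silence_secs: float, s: PipelineSettings) -> str:
    """Map elapsed silence to 'none', 'warn' (re-engage the caller), or 'end_call'."""
    if silence_secs >= s.idle_end_secs:
        return "end_call"
    if silence_secs >= s.idle_warn_secs:
        return "warn"
    return "none"


def is_speech_frame(confidence: float, volume: float, s: PipelineSettings) -> bool:
    """A frame counts as speech only if it clears both the confidence and volume thresholds."""
    return confidence >= s.vad_confidence and volume >= s.vad_min_volume


def should_send_turn(transcript: str, s: PipelineSettings) -> bool:
    """Only forward the turn to the LLM once it has at least min_words words."""
    return len(transcript.split()) >= s.min_words


if __name__ == "__main__":
    noisy = PipelineSettings(vad_confidence=0.7, vad_min_volume=0.6)
    print(is_speech_frame(0.65, 0.5, PipelineSettings()))  # passes the default thresholds
    print(is_speech_frame(0.65, 0.5, noisy))               # rejected by the stricter ones
    print(idle_action(30, PipelineSettings()), should_send_turn("uh huh", PipelineSettings()))
```

Raising the two VAD thresholds makes the same frame fail the speech test, which is the effect the noisy-environment advice relies on.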