Configuration
All configuration is loaded from environment variables via config/settings.py (Pydantic Settings). Copy .env.example to .env and fill in the values you need.
Telephony credentials can also be stored in the database via
PUT /api/settings/telephony. Database values take precedence over environment variables.
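The precedence rule can be pictured as a simple overlay of database values on top of environment values. The sketch below is only an illustration of that idea, not the project's actual code: `resolve_telephony_settings` and the dictionary shapes are hypothetical, and skipping empty database values is an assumption.

```python
from typing import Mapping, Optional


def resolve_telephony_settings(
    env: Mapping[str, str],
    db: Mapping[str, Optional[str]],
) -> dict[str, str]:
    """Overlay database values on top of environment values.

    Hypothetical helper: a database entry wins only when it is non-empty,
    so a blank row never hides a working environment variable.
    """
    merged = dict(env)
    for key, value in db.items():
        if value:  # skip None and empty strings (an assumption of this sketch)
            merged[key] = value
    return merged


env = {"TELEPHONY_PROVIDER": "twilio", "TWILIO_AUTH_TOKEN": "from-env"}
db = {"TWILIO_AUTH_TOKEN": "from-db", "TWILIO_PHONE_NUMBER": None}
settings = resolve_telephony_settings(env, db)
```

Here `settings["TWILIO_AUTH_TOKEN"]` comes from the database, while `TELEPHONY_PROVIDER` falls back to the environment.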
Contents
- Minimum required
- Application
- Data stores
- Authentication
- Email (SMTP)
- Telephony providers
- AI services
- Recordings
- Dashboard
- Per-agent configuration
Minimum Required
Backend startup only
env
DATABASE_URL=postgresql+asyncpg://postgres:postgres@localhost:5432/voiceagent
REDIS_URL=redis://localhost:6379
Live calls (add to above)
env
PUBLIC_BASE_URL=https://your-public-url.com
# Telephony (pick one)
TELEPHONY_PROVIDER=twilio
TWILIO_ACCOUNT_SID=ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TWILIO_AUTH_TOKEN=your_auth_token
TWILIO_PHONE_NUMBER=+15551234567
# STT (at least one)
DEEPGRAM_API_KEY=...
# LLM (at least one)
OPENAI_API_KEY=...
# TTS (at least one)
CARTESIA_API_KEY=...
Password reset (add to above)
env
JWT_SECRET_KEY=a-random-secret-at-least-32-chars
SMTP_HOST=smtp.gmail.com
SMTP_USER=you@yourdomain.com
SMTP_PASSWORD=your-app-password
SMTP_FROM_EMAIL=no-reply@yourdomain.com
Application
| Variable | Default | Description |
|---|---|---|
| APP_ENV | development | development or production; affects error verbosity and logging |
| APP_HOST | 0.0.0.0 | Server bind address |
| APP_PORT | 8000 | Server port |
| LOG_LEVEL | info | debug · info · warning · error |
| PUBLIC_BASE_URL | http://localhost:8000 | Publicly accessible HTTPS URL, used to build telephony webhook URLs and WebSocket stream URLs. Required for any telephony to work. |
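To illustrate how one base URL can yield both a webhook URL and a WebSocket stream URL, here is a minimal sketch. The function name and the `/webhooks/voice` and `/stream/<call_id>` paths are placeholders chosen for this example, not the application's real routes.

```python
from urllib.parse import urlsplit, urlunsplit


def build_call_urls(public_base_url: str, call_id: str) -> dict[str, str]:
    """Derive a webhook URL and a WebSocket stream URL from PUBLIC_BASE_URL.

    The paths used here are hypothetical placeholders.
    """
    parts = urlsplit(public_base_url.rstrip("/"))
    # An https base maps to wss; a plain http base maps to ws.
    ws_scheme = "wss" if parts.scheme == "https" else "ws"
    webhook = urlunsplit(
        (parts.scheme, parts.netloc, f"{parts.path}/webhooks/voice", "", "")
    )
    stream = urlunsplit(
        (ws_scheme, parts.netloc, f"{parts.path}/stream/{call_id}", "", "")
    )
    return {"webhook": webhook, "stream": stream}


urls = build_call_urls("https://your-public-url.com/", "call-123")
```

With the default `http://localhost:8000` the derived URLs point at localhost, which a telephony provider cannot reach; that is why a public HTTPS URL is needed for live calls.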
Data Stores
| Variable | Default | Description |
|---|---|---|
| DATABASE_URL | postgresql+asyncpg://postgres:postgres@localhost:5432/voiceagent | PostgreSQL async connection string |
| REDIS_URL | redis://localhost:6379/0 | Redis connection string, used for pub/sub and active call state |
Authentication
| Variable | Default | Description |
|---|---|---|
| JWT_SECRET_KEY | change-me-in-production-use-a-real-secret | Secret used to sign JWT tokens. Change this in production. Use a random string of at least 32 characters. |
| JWT_EXPIRY_HOURS | 24 | How long JWT tokens remain valid (hours) |
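The two pieces of advice in the table (change the default in production, use at least 32 characters) could be checked at startup. The following is a sketch under that assumption; `jwt_secret_problems` is a hypothetical helper and nothing here is confirmed to exist in config/settings.py.

```python
DEFAULT_SECRET = "change-me-in-production-use-a-real-secret"


def jwt_secret_problems(secret: str, app_env: str) -> list[str]:
    """Return reasons the secret is unsuitable; an empty list means it passed."""
    problems = []
    if app_env == "production" and secret == DEFAULT_SECRET:
        problems.append("JWT_SECRET_KEY is still the shipped default")
    if len(secret) < 32:
        problems.append("JWT_SECRET_KEY is shorter than 32 characters")
    return problems
```

A caller might log these messages in development and refuse to start in production when the list is non-empty.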
Generate a secure secret:
bash
python3 -c "import secrets; print(secrets.token_hex(32))"
Email (SMTP)
Required for the forgot-password / password reset flow.
| Variable | Default | Description |
|---|---|---|
| SMTP_HOST | (none) | SMTP server hostname (e.g. smtp.gmail.com) |
| SMTP_PORT | 587 | SMTP port (587 for STARTTLS, 465 for SSL) |
| SMTP_USER | (none) | SMTP username / login email |
| SMTP_PASSWORD | (none) | SMTP password or app-specific password |
| SMTP_FROM_EMAIL | (none) | Sender address shown on reset emails |
| SMTP_FROM_NAME | Dariet | Sender display name |
| SMTP_USE_TLS | true | Enable STARTTLS (true) or plain (false) |
Gmail example:
env
SMTP_HOST=smtp.gmail.com
SMTP_PORT=587
SMTP_USER=you@gmail.com
SMTP_PASSWORD=xxxx-xxxx-xxxx-xxxx # App password, not account password
SMTP_FROM_EMAIL=you@gmail.com
SMTP_USE_TLS=true
If SMTP is not configured, the forgot-password endpoint returns 200 but no email is sent.
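The "returns 200 but sends nothing" behaviour can be sketched as a message builder that yields nothing when the configuration is incomplete. This is an illustration only: `build_reset_email`, the subject line, and the choice of which four variables count as required (those without a default in the table) are assumptions.

```python
from email.message import EmailMessage
from email.utils import formataddr
from typing import Mapping, Optional

# Assumption: the variables with no default must all be set.
REQUIRED = ("SMTP_HOST", "SMTP_USER", "SMTP_PASSWORD", "SMTP_FROM_EMAIL")


def build_reset_email(
    env: Mapping[str, str], to_addr: str, reset_link: str
) -> Optional[EmailMessage]:
    """Build the reset message, or return None when SMTP is not configured."""
    if any(not env.get(name) for name in REQUIRED):
        return None  # the endpoint would still answer 200, but nothing is sent
    msg = EmailMessage()
    sender_name = env.get("SMTP_FROM_NAME", "Dariet")
    msg["From"] = formataddr((sender_name, env["SMTP_FROM_EMAIL"]))
    msg["To"] = to_addr
    msg["Subject"] = "Reset your password"  # placeholder subject
    msg.set_content(f"Use this link to reset your password: {reset_link}")
    return msg
```

Sending the message is left out on purpose; with the standard library it would typically go through `smtplib.SMTP` plus `starttls()` on port 587, or `smtplib.SMTP_SSL` on port 465.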
Telephony Providers
Global
| Variable | Default | Description |
|---|---|---|
| TELEPHONY_PROVIDER | twilio | Default provider when not specified per-call: twilio · exotel · vobiz |
Twilio
| Variable | Description |
|---|---|
| TWILIO_ACCOUNT_SID | Account SID from the Twilio console (starts with AC) |
| TWILIO_AUTH_TOKEN | Auth token from the Twilio console |
| TWILIO_PHONE_NUMBER | Default caller ID for outbound calls (E.164) |
Exotel
| Variable | Default | Description |
|---|---|---|
| EXOTEL_ACCOUNT_SID | (none) | Account SID |
| EXOTEL_API_KEY | (none) | API key |
| EXOTEL_API_TOKEN | (none) | API token |
| EXOTEL_PHONE_NUMBER | (none) | Default virtual number (E.164) |
| EXOTEL_API_BASE | api.in.exotel.com | API cluster: api.in.exotel.com (Mumbai) or api.exotel.com (Singapore) |
Vobiz
| Variable | Description |
|---|---|
| VOBIZ_AUTH_ID | Auth ID |
| VOBIZ_AUTH_TOKEN | Auth token |
| VOBIZ_PHONE_NUMBER | Default phone number (E.164) |
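A quick way to catch an incomplete provider setup is to compare the selected provider against the variables listed in the tables above. The helper below is a hypothetical sketch; treating every listed credential as mandatory is an assumption, and `EXOTEL_API_BASE` is left out because it has a default.

```python
from typing import Mapping

REQUIRED_VARS = {
    "twilio": ("TWILIO_ACCOUNT_SID", "TWILIO_AUTH_TOKEN", "TWILIO_PHONE_NUMBER"),
    "exotel": (
        "EXOTEL_ACCOUNT_SID",
        "EXOTEL_API_KEY",
        "EXOTEL_API_TOKEN",
        "EXOTEL_PHONE_NUMBER",
    ),
    "vobiz": ("VOBIZ_AUTH_ID", "VOBIZ_AUTH_TOKEN", "VOBIZ_PHONE_NUMBER"),
}


def missing_telephony_vars(env: Mapping[str, str]) -> list[str]:
    """List the unset variables for the provider named by TELEPHONY_PROVIDER."""
    provider = env.get("TELEPHONY_PROVIDER", "twilio")  # documented default
    if provider not in REQUIRED_VARS:
        raise ValueError(f"unknown TELEPHONY_PROVIDER: {provider!r}")
    return [name for name in REQUIRED_VARS[provider] if not env.get(name)]
```

For example, an environment with only `TELEPHONY_PROVIDER=vobiz` and `VOBIZ_AUTH_ID` set would report the token and phone number as missing.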
AI Services
Many providers share keys (e.g. OPENAI_API_KEY covers STT, LLM, and TTS via OpenAI). Only set keys for the providers you intend to use.
Speech-to-Text (STT)
| Variable | Covers |
|---|---|
| DEEPGRAM_API_KEY | Deepgram STT (Nova models) |
| SARVAM_API_KEY | Sarvam STT + TTS (Saaras, Bulbul) |
| ELEVENLABS_API_KEY | ElevenLabs STT + TTS |
| OPENAI_API_KEY | OpenAI Whisper STT + LLM + TTS |
Large Language Model (LLM)
| Variable | Covers |
|---|---|
| OPENAI_API_KEY | OpenAI (GPT-4.1 family) |
| GOOGLE_API_KEY | Google Gemini |
| XAI_API_KEY | xAI Grok |
Text-to-Speech (TTS)
| Variable | Covers |
|---|---|
| CARTESIA_API_KEY | Cartesia Sonic |
| ELEVENLABS_API_KEY | ElevenLabs Flash / Turbo |
| SARVAM_API_KEY | Sarvam Bulbul |
| DEEPGRAM_API_KEY | Deepgram Aura |
| OPENAI_API_KEY | OpenAI TTS |
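Because keys overlap across stages, it can help to check that the keys you have set cover STT, LLM, and TTS at least once each. The mapping below is transcribed from the three tables above; the function name is hypothetical and the check itself is a sketch, not part of the application.

```python
from typing import Mapping

# Which pipeline stages each key can serve, per the tables above.
KEY_COVERAGE = {
    "DEEPGRAM_API_KEY": {"stt", "tts"},
    "SARVAM_API_KEY": {"stt", "tts"},
    "ELEVENLABS_API_KEY": {"stt", "tts"},
    "OPENAI_API_KEY": {"stt", "llm", "tts"},
    "GOOGLE_API_KEY": {"llm"},
    "XAI_API_KEY": {"llm"},
    "CARTESIA_API_KEY": {"tts"},
}


def uncovered_stages(env: Mapping[str, str]) -> set[str]:
    """Return the stages (stt, llm, tts) that no configured key can serve."""
    covered: set[str] = set()
    for key, stages in KEY_COVERAGE.items():
        if env.get(key):
            covered |= stages
    return {"stt", "llm", "tts"} - covered
```

A lone `OPENAI_API_KEY` covers all three stages, whereas Deepgram plus Cartesia still leaves the LLM stage without a key.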
Recordings
| Variable | Default | Description |
|---|---|---|
| RECORDINGS_DIR | ./recordings | Directory where WAV files are stored |
| RECORDING_RETENTION_DAYS | 30 | Days to keep recordings before auto-deletion. Set to 0 to disable deletion. |
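The retention rule, including the "0 disables deletion" case, might look like the sketch below. It assumes recordings sit directly in `RECORDINGS_DIR` as `.wav` files and that age is measured by file modification time; the real cleanup job may differ on both points, and `purge_old_recordings` is a hypothetical name.

```python
import time
from pathlib import Path
from typing import Optional


def purge_old_recordings(
    recordings_dir: Path, retention_days: int, now: Optional[float] = None
) -> list[Path]:
    """Delete WAV files older than retention_days and return the removed paths."""
    if retention_days <= 0:
        return []  # 0 disables deletion
    current = time.time() if now is None else now
    cutoff = current - retention_days * 86400  # seconds per day
    removed = []
    for wav in sorted(recordings_dir.glob("*.wav")):
        if wav.stat().st_mtime < cutoff:
            wav.unlink()
            removed.append(wav)
    return removed
```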
Dashboard
| Variable | Default | Description |
|---|---|---|
| DASHBOARD_URL | http://localhost:5173 | Dashboard origin, added to the CORS allow-list |
| VITE_API_URL | http://localhost:8000/api | API base URL used by the Vite dev server (set in dashboard/.env) |
Per-Agent Configuration
Each agent stores its own AI provider config in the database. These are set when creating or updating an agent via the API or dashboard.
STT options
| Provider | stt_provider | Models (stt_model) | Notes |
|---|---|---|---|
| Deepgram | deepgram | nova-3-general, nova-2-general, nova-2-phonecall | Default. Best for English phone calls. |
| Sarvam | sarvam | saaras:v3, saarika:v2.5 | Indian languages. Set stt_mode to transcribe or translate. |
| OpenAI | openai | whisper-1 | General purpose. Higher latency. |
| ElevenLabs | elevenlabs | scribe_v1 | Multilingual. |
stt_language: BCP-47 language tag — en, hi-IN, ta-IN, te-IN, kn-IN, mr-IN, etc.
stt_mode (Sarvam only):
- transcribe: returns text in the spoken language
- translate: returns text translated to English
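The STT fields above can be combined into one agent payload fragment. The sketch below validates a provider/model pair against the table and rejects `stt_mode` for anything other than Sarvam; `build_stt_config` is a hypothetical helper, and whether the API itself enforces these rules is not stated here.

```python
from typing import Optional

# Provider/model pairs transcribed from the STT options table.
STT_MODELS = {
    "deepgram": {"nova-3-general", "nova-2-general", "nova-2-phonecall"},
    "sarvam": {"saaras:v3", "saarika:v2.5"},
    "openai": {"whisper-1"},
    "elevenlabs": {"scribe_v1"},
}


def build_stt_config(
    provider: str, model: str, language: str, mode: Optional[str] = None
) -> dict[str, str]:
    """Assemble the stt_* fields for an agent, checking them against the table."""
    if model not in STT_MODELS.get(provider, set()):
        raise ValueError(f"{model!r} is not a listed model for {provider!r}")
    config = {"stt_provider": provider, "stt_model": model, "stt_language": language}
    if mode is not None:
        if provider != "sarvam":
            raise ValueError("stt_mode applies to Sarvam only")
        if mode not in ("transcribe", "translate"):
            raise ValueError(f"unknown stt_mode: {mode!r}")
        config["stt_mode"] = mode
    return config


hindi_agent = build_stt_config("sarvam", "saaras:v3", "hi-IN", mode="translate")
```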
LLM options
| Provider | llm_provider | Models (llm_model) | Notes |
|---|---|---|---|
| OpenAI | openai | gpt-4.1-nano, gpt-4.1-mini, gpt-4.1, gpt-4o-mini, gpt-4o | Default. gpt-4.1-nano is fastest and cheapest. |
| Google | google | gemini-2.5-flash, gemini-2.0-flash, gemini-1.5-flash | Good multilingual support. |
| xAI | grok | grok-3-beta, grok-3-mini-beta | |
llm_settings:
json
{
"temperature": 0.1,
"max_tokens": 150
}
Keep max_tokens low (100–200) for voice, since long responses feel unnatural on a call.
TTS options
| Provider | tts_provider | Models (tts_model) | Notes |
|---|---|---|---|
| Cartesia | cartesia | sonic-3, sonic-2, sonic-turbo | Default. Lowest latency. |
| ElevenLabs | elevenlabs | eleven_flash_v2_5, eleven_turbo_v2_5 | Natural voices, multilingual. |
| Sarvam | sarvam | bulbul:v2 | Indian languages. |
| Deepgram | deepgram | aura-2-en-us, aura-asteria-en | English only. |
| OpenAI | openai | tts-1, tts-1-hd | Standard voices. |
voice: Voice ID format varies by provider:
- Cartesia (UUID): 694f9389-aac1-45b6-b726-9d9369183238
- ElevenLabs (voice ID or name): 21m00Tcm4TlvDq8ikWAM
- Sarvam (speaker name): meera, arjun, amol
- Deepgram (voice name): asteria, luna, stella
- OpenAI (voice name): alloy, echo, fable, onyx, nova, shimmer
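Since the voice format depends on the provider, a light sanity check can catch a voice copied from the wrong provider. The sketch below only verifies the one format that is strictly defined above (Cartesia's UUID) and accepts any non-empty string elsewhere; `voice_matches_provider` is a hypothetical helper, not an API of the application.

```python
import uuid


def is_uuid(value: str) -> bool:
    """True when the string parses as a UUID."""
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


def voice_matches_provider(tts_provider: str, voice: str) -> bool:
    """Rough format check for the voice field of an agent."""
    if tts_provider == "cartesia":
        return is_uuid(voice)
    # Other providers use names or opaque IDs, so only require a value.
    return bool(voice)
```

Note that `uuid.UUID` is lenient about hyphens and braces, so this is a plausibility check rather than strict validation.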
tts_settings examples:
json
// Cartesia — speed and emotion
{ "speed": 1.05, "emotion": ["positivity:high", "curiosity:medium"] }
// ElevenLabs — stability and similarity boost
{ "stability": 0.5, "similarity_boost": 0.75 }
Pipeline settings
Controls voice activity detection (VAD), turn detection, and idle timeouts.
| Field | Default | Description |
|---|---|---|
| idle_warn_secs | 25 | Seconds of silence before the agent says something to re-engage the caller |
| idle_end_secs | 50 | Seconds of silence before the agent ends the call |
| vad_confidence | 0.5 | Silero VAD confidence threshold (0–1). Lower = more sensitive to speech. |
| vad_start_secs | 0.2 | Seconds of detected speech before transcription starts |
| vad_stop_secs | 0.2 | Seconds of silence after speech before the turn is submitted |
| vad_min_volume | 0.4 | Minimum RMS volume to count as speech (0–1). Filters background noise. |
| min_words | 3 | Minimum word count before the turn is sent to the LLM. Filters filler words. |
json
{
"pipeline_settings": {
"idle_warn_secs": 20,
"idle_end_secs": 40,
"vad_confidence": 0.4,
"vad_start_secs": 0.15,
"vad_stop_secs": 0.3,
"vad_min_volume": 0.3,
"min_words": 2
}
}
For noisy environments (call centres, outdoor), raise vad_min_volume and vad_confidence. For soft-spoken callers, lower them.
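One way to keep these values consistent before sending them to the API is a small dataclass that carries the documented defaults and range-checks the two 0–1 fields. This is a sketch: the class is hypothetical, the rule that `idle_end_secs` must exceed `idle_warn_secs` is an assumption, and the "noisy" values are illustrative rather than recommended.

```python
from dataclasses import asdict, dataclass


@dataclass
class PipelineSettings:
    # Defaults copied from the pipeline settings table.
    idle_warn_secs: float = 25
    idle_end_secs: float = 50
    vad_confidence: float = 0.5
    vad_start_secs: float = 0.2
    vad_stop_secs: float = 0.2
    vad_min_volume: float = 0.4
    min_words: int = 3

    def __post_init__(self) -> None:
        for name in ("vad_confidence", "vad_min_volume"):
            value = getattr(self, name)
            if not 0 <= value <= 1:
                raise ValueError(f"{name} must be between 0 and 1, got {value}")
        # Assumption: the warning should fire before the call is ended.
        if self.idle_end_secs <= self.idle_warn_secs:
            raise ValueError("idle_end_secs should be greater than idle_warn_secs")


# Illustrative values for a noisy line: both thresholds raised above the defaults.
noisy = PipelineSettings(vad_confidence=0.7, vad_min_volume=0.6)
payload = {"pipeline_settings": asdict(noisy)}
```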