Agent Creation Guide

Agents are the core of the platform. Each agent handles inbound or outbound calls with its own voice, persona, and conversation logic.

Two Modes

	Freeform	Flow
Structure	Single system prompt	Multiple nodes, each with its own prompt
Tools	Available throughout the call	Per-node
Control	LLM decides everything	Structured, step-by-step
Best for	Open-ended conversations	Scripted, multi-stage conversations

Use freeform when the caller can ask anything at any time, for example support, FAQ, or lookup calls.

Use flows when the conversation moves through stages, for example qualify → book → confirm, or greet → verify → offer → close.

Contents

Doc	What it covers
Freeform Agents	Simple agents, multilingual, inline tools
Flow Agents	Multi-node flows, branching, escape hatches
Tools	Webhook tools, pre-call tools, pre-actions
Outbound Calls	Context variables, outbound dialling
Objection Handling	Call me later, not interested, transfer to human, DNC
JSON Import & AI Creation	Full JSON schema, AI prompt template, common mistakes

Quick Start

Minimal freeform agent (API):

bash

curl -X POST https://your-domain.com/api/agents \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "support-agent",
    "prompt": "You are Maya, a friendly support agent for Acme Corp. Help customers with order queries.",
    "greeting": "Hi, this is Maya from Acme. How can I help you today?"
  }'
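The same request can be built from a script. The sketch below mirrors the curl call above using only the Python standard library; the URL and token are the same placeholders, the helper name build_create_agent_request is illustrative, and the response format is not described in this guide, so the request is built but not sent.

```python
import json
import urllib.request

# Placeholders copied from the curl example; replace both before use.
BASE_URL = "https://your-domain.com"
TOKEN = "<token>"


def build_create_agent_request(name: str, prompt: str, greeting: str) -> urllib.request.Request:
    """Build (but do not send) a POST /api/agents request. Helper name is illustrative."""
    body = json.dumps({"name": name, "prompt": prompt, "greeting": greeting}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/agents",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
    )


request = build_create_agent_request(
    name="support-agent",
    prompt="You are Maya, a friendly support agent for Acme Corp. Help customers with order queries.",
    greeting="Hi, this is Maya from Acme. How can I help you today?",
)
# To send it, pass the request to urllib.request.urlopen(request).
```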

Minimal freeform agent (import JSON):

json

{
  "version": "1",
  "agent": {
    "name": "support-agent",
    "prompt": "You are Maya, a friendly support agent for Acme Corp.",
    "greeting": "Hi, this is Maya from Acme. How can I help you today?"
  },
  "tools": [],
  "flow_nodes": []
}

Upload via Dashboard → Agents → ↑ Import, or POST /api/agents/import.
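If import files are generated rather than written by hand, a small helper keeps the structure consistent. This is a minimal sketch that reproduces only the fields shown in the example above; the function name is illustrative, and which fields are required is an assumption to check against the full schema in JSON Import & AI Creation.

```python
import json


def build_import_document(name: str, prompt: str, greeting: str) -> dict:
    """Return a minimal freeform import document with the fields shown in this guide."""
    return {
        "version": "1",
        "agent": {"name": name, "prompt": prompt, "greeting": greeting},
        "tools": [],       # no tools in the minimal example
        "flow_nodes": [],  # left empty, as in the freeform example
    }


document = build_import_document(
    name="support-agent",
    prompt="You are Maya, a friendly support agent for Acme Corp.",
    greeting="Hi, this is Maya from Acme. How can I help you today?",
)
import_json = json.dumps(document, indent=2)
print(import_json)
```

The resulting text can be saved to a file for the dashboard import, or sent as the body of POST /api/agents/import.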

Pipeline at a Glance

Every call goes through:

Audio In → VAD → STT → LLM → TTS → Audio Out

VAD — detects when the caller starts and stops speaking
STT — converts speech to text (Deepgram, Sarvam, OpenAI, ElevenLabs)
LLM — generates the response (OpenAI, Google, Grok)
TTS — converts text to speech (Cartesia, ElevenLabs, Sarvam, Deepgram, OpenAI)

Default stack: deepgram nova-3-general + gpt-4.1-nano + cartesia sonic-3.
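To make the stage order concrete, the sketch below chains four stub stages in the order listed above. It is a toy model, not the platform's implementation: the stubs only wrap their input in a label, and the DefaultStack field names are illustrative, with values taken from the default stack line.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class DefaultStack:
    """Default models named in this guide; the field names are illustrative."""
    stt: str = "deepgram nova-3-general"
    llm: str = "gpt-4.1-nano"
    tts: str = "cartesia sonic-3"


Stage = Callable[[str], str]


def run_turn(audio_in: str, stages: Sequence[Stage]) -> str:
    """Pass one caller turn through each stage in order."""
    data = audio_in
    for stage in stages:
        data = stage(data)
    return data


# Stub stages: each wraps its input so the processing order is visible.
def vad(audio: str) -> str:
    return f"vad({audio})"


def stt(speech: str) -> str:
    return f"stt({speech})"


def llm(text: str) -> str:
    return f"llm({text})"


def tts(reply: str) -> str:
    return f"tts({reply})"


stack = DefaultStack()
result = run_turn("audio", [vad, stt, llm, tts])
print(result)  # tts(llm(stt(vad(audio))))
```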

Auto-Injected Rules

The platform appends these to every agent's system prompt automatically — you do not need to add them:

Max 2 sentences per response (unless the caller asks for detail)
No markdown, bullet points, or special characters
Spell out numbers in full
No filler openers ("Certainly!", "Of course!", etc.)
Call end_call when the conversation is complete
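Conceptually, the final system prompt is the agent's own prompt followed by these rules. The sketch below illustrates that idea only; the exact wording, order, and formatting the platform injects are not specified in this guide, and with_auto_rules is a hypothetical helper.

```python
# Rule texts copied from the list above; the platform's actual wording may differ.
AUTO_RULES = [
    "Max 2 sentences per response (unless the caller asks for detail)",
    "No markdown, bullet points, or special characters",
    "Spell out numbers in full",
    'No filler openers ("Certainly!", "Of course!", etc.)',
    "Call end_call when the conversation is complete",
]


def with_auto_rules(agent_prompt: str) -> str:
    """Append the auto-injected rules to an agent prompt (illustrative format)."""
    rules = "\n".join(f"- {rule}" for rule in AUTO_RULES)
    return f"{agent_prompt}\n\n{rules}"


full_prompt = with_auto_rules("You are Maya, a friendly support agent for Acme Corp.")
print(full_prompt)
```

Because the rules are added automatically, repeating them in your own prompt only makes it longer.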

Auto-Injected Call Metadata

The LLM always receives a call info line. For an inbound call:

Call info: This is an inbound call. Caller phone: +919876543210.

For outbound calls with context variables:

Call context:
- customer_name: Ravi Kumar
- invoice_amount: 4500
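When testing prompts locally, it can help to reproduce this metadata. The sketch below formats the two blocks exactly as shown above; both function names are hypothetical, and the wording of the call info line for outbound calls is not shown in this guide, so only the inbound form is covered.

```python
def render_call_info(caller_phone: str) -> str:
    """Format the inbound call info line as shown in this guide."""
    return f"Call info: This is an inbound call. Caller phone: {caller_phone}."


def render_call_context(variables: dict) -> str:
    """Format context variables as a 'Call context:' block, one variable per line."""
    lines = ["Call context:"]
    lines += [f"- {key}: {value}" for key, value in variables.items()]
    return "\n".join(lines)


info = render_call_info("+919876543210")
context = render_call_context({"customer_name": "Ravi Kumar", "invoice_amount": 4500})
print(info)
print(context)
```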
