Create Agent
Create a new AI voice agent with a custom voice, LLM, prompt, and telephony settings.
When to use this
Use this endpoint to programmatically create agents as part of your onboarding flow, template system, or multi-tenant setup. For example, if each of your customers needs their own agent pre-configured with their business details, this lets you spin them up automatically.
Key fields
- pipeline_type: Use Cascaded for the standard STT, LLM, and TTS pipeline. Use Speech_To_Speech for realtime speech models such as GPT realtime.
- prompts.prompt_text: The system prompt that shapes the agent’s personality, instructions, and scope. Write it clearly; the LLM follows this closely.
- voice: Pass a voice id from List Voices, or configure it inline.
- speech.transcriber_id: Selects the primary STT provider for cascaded agents. Speech to Speech agents do not use a separate transcriber.
- speech.fallback_stt_enabled: Enables a backup STT provider for cascaded agents. When enabled, also send speech.stt_fallback_transcriber_id.
- speech.audio_cache_enabled: Caches repeated TTS phrases for lower latency on cascaded agents. Speech to Speech agents do not use TTS audio cache.
- telephony.call_limits: Set how long a call can run, how long to wait before ending on silence, and how long to ring before giving up.
- functions: Tools the agent can invoke mid-call, such as calendar booking, CRM lookups, or custom webhooks.
- analysis.postcall_analysis: Fields to extract from the transcript after each call (e.g., CSAT score, intent, outcome).
- metadata via dynamic variables: You can reference {{variable_name}} in your prompt; values are injected at call time from the call’s metadata object.
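As a rough illustration of how these fields might fit together, the Python sketch below builds a request body as a nested dictionary. The nesting is inferred from the dotted field names; the shape of voice (an object with an id key), every ID value, and the render_prompt helper are assumptions made for illustration and are not taken from the API reference. render_prompt only mimics locally how {{variable_name}} placeholders could be filled from a call's metadata object; the platform performs the real substitution at call time, and its handling of missing variables may differ.

```python
import json
import re

# Hypothetical request body; the nesting is inferred from the dotted field names.
agent_body = {
    "pipeline_type": "Cascaded",
    "prompts": {
        "prompt_text": (
            "You are a support agent for {{business_name}}. "
            "Greet {{caller_name}} by name."
        ),
    },
    "voice": {"id": "voice_placeholder"},  # assumed shape; use an id from List Voices
    "speech": {
        "transcriber_id": "stt_primary_placeholder",  # placeholder value
        "fallback_stt_enabled": True,
        # Sent because fallback_stt_enabled is true.
        "stt_fallback_transcriber_id": "stt_backup_placeholder",
        "audio_cache_enabled": True,
    },
}


def render_prompt(prompt_text, metadata):
    """Local illustration: replace {{name}} with metadata[name].

    Names missing from metadata are left untouched (a choice of this sketch).
    """
    def substitute(match):
        key = match.group(1)
        return str(metadata[key]) if key in metadata else match.group(0)

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, prompt_text)


rendered = render_prompt(
    agent_body["prompts"]["prompt_text"],
    {"business_name": "Acme Dental"},
)
print(json.dumps(agent_body, indent=2))
print(rendered)
```

Previewing the rendered prompt this way can help catch misspelled variable names before the agent takes a live call.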
Pipeline Types
| Pipeline type | Use it when | Notes |
|---|---|---|
| Cascaded | You want separate transcriber, LLM, and voice controls. | This is the default for single prompt agents and supports fallback STT, voice model pricing, and Audio Cache. |
| Speech_To_Speech | You want a realtime speech model to listen and speak directly. | The model selector is limited to speech-to-speech models, and voice model, transcriber, and Audio Cache controls are not used. |
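The differences in the table can be checked on the client side before a request is sent. The sketch below is one possible pre-flight check, not part of the API: check_pipeline_settings is a hypothetical helper, and it assumes the speech settings are nested under a speech key as the dotted field names suggest. Whether the server rejects or silently ignores unused fields is not stated here, so the helper only reports them.

```python
def check_pipeline_settings(body):
    """Return a list of human-readable problems found in a draft request body."""
    problems = []
    # Cascaded is the documented default when pipeline_type is omitted.
    pipeline = body.get("pipeline_type", "Cascaded")
    speech = body.get("speech", {})

    if pipeline not in ("Cascaded", "Speech_To_Speech"):
        problems.append(f"unknown pipeline_type: {pipeline!r}")
    elif pipeline == "Cascaded":
        # Fallback STT needs a backup transcriber to fall back to.
        if speech.get("fallback_stt_enabled") and not speech.get(
            "stt_fallback_transcriber_id"
        ):
            problems.append(
                "fallback_stt_enabled is set but stt_fallback_transcriber_id is missing"
            )
    else:
        # Speech_To_Speech: transcriber and audio cache controls are not used.
        for unused in ("transcriber_id", "fallback_stt_enabled", "audio_cache_enabled"):
            if speech.get(unused):
                problems.append(
                    f"speech.{unused} is not used by Speech_To_Speech agents"
                )
    return problems


draft = {"pipeline_type": "Speech_To_Speech", "speech": {"transcriber_id": "stt_x"}}
for problem in check_pipeline_settings(draft):
    print(problem)
```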
Authorizations
Pass your API key as a Bearer token in the Authorization header.
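A minimal sketch of attaching the Bearer token with Python's standard library follows. The URL, the POST method, and the build_create_agent_request helper are assumptions made for illustration; substitute the real endpoint from your API reference. The request is only constructed here, not sent.

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder; load from a secret store in practice
CREATE_AGENT_URL = "https://api.example.com/agents"  # hypothetical URL


def build_create_agent_request(api_key, body):
    """Build (but do not send) a JSON request carrying the Bearer token."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        CREATE_AGENT_URL,
        data=data,
        method="POST",  # assumed; typical for create endpoints
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


request = build_create_agent_request(API_KEY, {"pipeline_type": "Cascaded"})
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```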
Body
Human-readable name for the agent.
Maximum length: 100. Example: "Customer Support Bot"
Set true to publish this agent after creation.
Example: true
Set true to update an existing published agent deployment.
Example: true
Optional folder ID to organize this agent.
Example: "fld_abc123"
Language configuration.
Who speaks first when the call connects.
Available options: user, agent. Example: "agent"
Whether the caller can interrupt the agent mid-sentence.
Example: true
Agent builder type. Speech to Speech uses Single_Prompt_Agent with pipeline_type set to Speech_To_Speech.
Available options: Single_Prompt_Agent, Conversational_Flow_Agent. Example: "Single_Prompt_Agent"
Pipeline subtype for single prompt agents. Defaults to Cascaded when omitted.
Available options: Cascaded, Speech_To_Speech. Example: "Speech_To_Speech"
Response
Agent created successfully.
"agt_abc123"
"org_xyz"
null
1
false
"2024-03-01T10:00:00.000Z"
"2024-03-01T10:00:00.000Z"
null
Pipeline subtype for single prompt agents. Cascaded uses separate STT, LLM, and TTS components. Speech_To_Speech uses a realtime speech model.
Available options: Cascaded, Speech_To_Speech. Example: "Cascaded"