DialNexa Voice AI works by moving each call through a loop: receive audio, transcribe speech, decide the next response with the configured agent version, optionally call tools, synthesize speech, and store the result in Call History.

Figure placeholder: DialNexa Voice AI call loop from caller audio to Deepgram or Soniox transcript, LLM response, ElevenLabs or Cartesia voice output, and call log.
If you can follow the loop, you can debug most calls without guessing. Guessing is dramatic, but logs are cheaper.

The Runtime Loop

1. Caller audio arrives. Audio enters through a Plivo number, SIP trunk, web call, batch call, workflow call node, API call, or test call.
2. The transcriber listens. Deepgram or Soniox converts speech into text. The transcript is later shown in Call History as realtime and, where available, post-call text.
3. The agent decides. The published agent version supplies prompt, system prompt, language, LLM model, dynamic variables, functions, and safety settings.
4. Tools may run. Functions and integrations can end a call, book a calendar event, call an API, send a WhatsApp message, send an email, or trigger another configured action.
5. The voice speaks. ElevenLabs or Cartesia synthesizes the response. Audio Cache can reduce latency for repeated exact phrases.
6. Evidence is saved. Call History receives status, summary, transcript, recording URL, post-call analysis, transfer details, Audio Cache data, and metadata.
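
The six steps above can be read as one function that runs once per caller turn. The sketch below is illustrative only: `run_turn`, `CallRecord`, and the callable parameters are hypothetical names, not DialNexa APIs, and the in-memory dictionary only imitates the "repeated exact phrases" behavior described for Audio Cache.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class CallRecord:
    """Hypothetical stand-in for the evidence Call History would store."""
    transcript: list[str] = field(default_factory=list)
    tool_calls: list[str] = field(default_factory=list)
    cache_hits: int = 0


def run_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    decide: Callable[[str], tuple[str, Optional[str]]],
    run_tool: Callable[[str], None],
    synthesize: Callable[[str], bytes],
    audio_cache: dict[str, bytes],
    record: CallRecord,
) -> bytes:
    """One pass through the loop: caller audio in, agent audio out."""
    text = transcribe(audio)                     # step 2: speech to text
    record.transcript.append(f"caller: {text}")

    reply, tool_name = decide(text)              # step 3: agent picks a reply
    if tool_name is not None:                    # step 4: optional tool call
        run_tool(tool_name)
        record.tool_calls.append(tool_name)

    if reply in audio_cache:                     # step 5: exact-phrase match only
        speech = audio_cache[reply]
        record.cache_hits += 1
    else:
        speech = synthesize(reply)
        audio_cache[reply] = speech

    record.transcript.append(f"agent: {reply}")  # step 6: save evidence
    return speech
```

Passing the stages in as callables keeps each layer swappable, which mirrors how a transcriber, LLM, or voice can be changed without touching the rest of the loop.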

Provider Work At Each Runtime Layer

| Layer | Provider choices | Why users should care |
| --- | --- | --- |
| Audio route | Plivo, SIP trunking, web call. | Changes number ownership, audio quality, routing, and whether phone network issues are involved. |
| Speech to text | Deepgram or Soniox. | Changes transcript accuracy, language fit, turn timing, and Response Eagerness support. |
| Reasoning | OpenAI, Google, or Groq. | Changes instruction following, latency, structured behavior, and fallback strategy. |
| Text to speech | ElevenLabs or Cartesia. | Changes voice identity, pronunciation, streaming behavior, and cache compatibility. |
| External actions | Wati, Resend, Custom Functions, webhooks. | Changes what the call or workflow can do outside DialNexa. |
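
One way to keep a provider stack honest is to validate it against the choices in the table before a call is placed. This is a minimal sketch under assumptions: `ProviderStack`, `ALLOWED`, and the lowercase identifiers are hypothetical and do not reflect DialNexa's actual configuration format.

```python
from dataclasses import dataclass

# Provider choices copied from the table above. The keys and lowercase
# spellings are illustrative labels, not DialNexa configuration values.
# External actions are left out because Custom Functions and webhooks
# are open-ended.
ALLOWED = {
    "audio_route": {"plivo", "sip", "web"},
    "speech_to_text": {"deepgram", "soniox"},
    "reasoning": {"openai", "google", "groq"},
    "text_to_speech": {"elevenlabs", "cartesia"},
}


@dataclass(frozen=True)
class ProviderStack:
    audio_route: str
    speech_to_text: str
    reasoning: str
    text_to_speech: str

    def problems(self) -> list[str]:
        """Return one message per layer whose provider is not in the table."""
        issues = []
        for layer, allowed in ALLOWED.items():
            value = getattr(self, layer)
            if value not in allowed:
                issues.append(f"{layer}: unknown provider {value!r}")
        return issues
```

An empty list from `problems()` means every layer names a provider from the table; any other result points at the layer to fix first.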

Runtime Input Versus Saved Evidence

Do not confuse what starts a call with what proves it happened.
| Layer | Examples | Where reviewed |
| --- | --- | --- |
| Input | Recipient number, metadata, dynamic variables, workflow lead variables, selected outbound number. | Batch setup, workflow lead, API request, or test call modal. |
| Agent version | Prompt, model, voice, transcriber, welcome mode, functions, post-call fields, security settings. | Agent builder and version history. |
| Call result | Status, duration, sentiment, end reason, transcript, summary, recording, extracted fields. | Call History and call detail page. |

What Can Change A Reply

The same caller sentence can lead to different behavior when these settings change.

Prompt and system prompt

Instruction wording decides goal, boundaries, escalation rules, and acceptable answers.

Dynamic variables

Caller-specific values can change greeting, eligibility, due date, location, or transfer destination.

Functions

Tools add actions the model can call during the conversation.

LLM and temperature

Model choice and temperature affect reasoning style and consistency.
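
To illustrate how dynamic variables alone can change a reply, the sketch below renders the same prompt with different caller values. The `$name` placeholder syntax comes from Python's `string.Template` and is an assumption for illustration; DialNexa's own variable syntax may differ, and `render_prompt` and the variable names are hypothetical.

```python
from string import Template

# Hypothetical prompt; the variable names are made up for this example.
PROMPT = Template(
    "Greet $first_name. Their payment is due on $due_date. "
    "Transfer disputes to $transfer_number."
)


def render_prompt(variables: dict[str, str]) -> str:
    """Fill the prompt with caller-specific values.

    Template.substitute raises KeyError when a variable is missing,
    which surfaces incomplete input before the call starts.
    """
    return PROMPT.substitute(variables)
```

Two callers who say the same sentence still reach the model with different prompts, so comparing the rendered prompt is a quick first check when replies diverge.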

Debug By Layer

Audio route: check recording quality, phone path, SIP trunk behavior, web microphone, and Denoising Mode.
Speech to text: check language, Deepgram or Soniox selection, background noise, and whether the caller spoke over the agent.
Reasoning: check prompt, dynamic variables, functions, knowledge source, model family, and temperature.
Latency and timing: check Response Eagerness, Audio Cache, fallback LLM, function latency, and integration action placement.
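
The four checklists above can be kept as a small lookup so a reviewer starts from the layer where the symptom appeared. This is a sketch, not a DialNexa feature: the keys `audio`, `transcript`, `reply`, and `latency` are illustrative labels chosen for this example, while the checks themselves are taken from the lists above.

```python
# Checks copied from the debug lists above; the layer keys are assumptions.
DEBUG_CHECKS = {
    "audio": ["recording quality", "phone path", "SIP trunk behavior",
              "web microphone", "Denoising Mode"],
    "transcript": ["language", "Deepgram or Soniox selection",
                   "background noise", "caller spoke over the agent"],
    "reply": ["prompt", "dynamic variables", "functions",
              "knowledge source", "model family", "temperature"],
    "latency": ["Response Eagerness", "Audio Cache", "fallback LLM",
                "function latency", "integration action placement"],
}


def checklist(layer: str) -> list[str]:
    """Return the checks for one layer, or fail with the valid layer names."""
    if layer not in DEBUG_CHECKS:
        valid = ", ".join(sorted(DEBUG_CHECKS))
        raise ValueError(f"unknown layer {layer!r}; expected one of: {valid}")
    return DEBUG_CHECKS[layer]
```

For example, `checklist("transcript")` returns the four speech-to-text checks, which keeps a review focused on one layer at a time.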

Speech To Text

Understand Deepgram and Soniox behavior.

LLM Behavior

Tune reasoning and fallback behavior.

Provider Selection Guide

Choose the complete provider stack.

Text To Speech

Choose ElevenLabs or Cartesia.

Integrations

Understand Wati, Resend, and workflow actions.