Calls
Calls drive all speech-to-speech interactions between your end users and the AI. At its most basic, a call consists of a set of instructions that tell the LLM how to behave (i.e. the systemPrompt) and the selection of the voice the AI should use when speaking.
List Calls
GET /calls
Lists all calls that have been created. Account scoped.
Parameters
cursor | Pagination cursor. |
No request body |
callId | string | Unique identifier for the call. |
clientVersion | string | The version of the client that was used to join the call. May also contain an additional string defined during joinCall. |
created | string | Datetime in UTC when the call was created. |
ended | string | Datetime in UTC when the call was ended. Will be null upon call creation and while call is underway. |
model | string | Name of the model used for the call. |
systemPrompt | string | The system prompt used to create the call. |
temperature | float | The temperature setting used for the model in this call. |
voice | string | The name of the AI voice used for the call. |
languageHint | string | BCP47 identifier used as a hint to identify the user’s spoken language. |
maxDuration | string | Maximum length (in seconds) that was set for the call. |
timeExceededMessage | string | Message that the agent will say if the call reaches the maxDuration. |
joinUrl | string | URL to use with the ultravox client library to join (AKA start) the call. |
Create Call
POST /calls
Creates a new call using the specified system prompt. Account scoped.
Optional parameters can be used to specify the voice, temperature, language hint, and recordingEnabled.
An optional query parameter, priorCallId, can be provided to continue a previous conversation. If used, all properties of the prior call (e.g. systemPrompt, voice) are used for the new call. Individual properties can still be overridden in the request body (e.g. inherit all properties but override the voice).
Parameters
priorCallId | string | The UUID of an existing call. If provided, the new call will inherit all call properties (unless overridden in the current request body). The prior call’s message history will be used in place of initialMessages. Setting initialMessages in the body is not allowed. |
firstSpeaker | string | Who should talk first when the call starts. Typically set to FIRST_SPEAKER_USER for outgoing calls and left as the default (FIRST_SPEAKER_AGENT) otherwise. |
inactivityMessages | array | Messages spoken by the agent when the user is inactive for the specified duration. See below for more information. |
initialMessages | array | The conversation history to start from for this call. See below for more information. |
initialOutputMedium | string | The initial medium (MESSAGE_MEDIUM_VOICE or MESSAGE_MEDIUM_TEXT) to use for the call. Defaults to MESSAGE_MEDIUM_VOICE. Once the call has started, the output medium can be changed by the client using setOutputMedium. |
joinTimeout | string | A timeout for joining the call. Specified in seconds. Defaults to “60s” (1 minute). |
languageHint | string | A BCP47 language code that may be used to guide speech recognition and synthesis. Best effort is made to select the closest supported language. |
maxDuration | string | Maximum length of the call, in seconds. The value must end in s and can be fractional, e.g. 300s or 245.5s. Optionally, set timeExceededMessage to have the agent say a message when the allotted time is reached. For the free plan, defaults to the amount of free time remaining (less any calls that are currently in progress). For paid plans, defaults to the system maximum of one hour (“3600s”). |
medium | object | Details about a call’s protocol. By default, calls occur over WebRTC using the Ultravox client SDK. Setting a different call medium will prepare the server for a call using a different protocol. At most one call medium may be set. See Connection Options for more. |
model | string | The model to be used for the call. If not specified, defaults to fixie-ai/ultravox. See available models. |
recordingEnabled | boolean | A recording will be saved for the call when set to true. The recording can be retrieved via GET /calls/{call_id}/recording. Defaults to false. |
selectedTools | array | Each object is a tool selected for the call. |
systemPrompt required | string | The system prompt to use for the AI. |
temperature | float | The temperature setting for the model. Value between 0.0 and 1.0. (default: 0.0) |
timeExceededMessage | string | Message that the agent will say if the call reaches the maxDuration. |
transcriptOptional | boolean | Defaults to true. Setting to false (not recommended) enables live user transcripts at the expense of latency. Furthermore, these transcripts may not match what the model actually hears. |
voice | string | The voice the AI will use for speaking. If not specified, defaults to Mark. See List Voices for all available voices. Please contact us if you have other voice requirements. |
callId | string | Unique identifier for the call. |
clientVersion | string | The version of the client that was used to join the call. May also contain an additional string defined during joinCall. |
created | string | Datetime in UTC when the call was created. |
ended | string | Datetime in UTC when the call was ended. Will be null upon call creation and while call is underway. |
endReason | string | The reason the call ended. |
firstSpeaker | string | Who was supposed to talk first when the call started. Typically set to FIRST_SPEAKER_USER for outgoing calls and left as the default (FIRST_SPEAKER_AGENT) otherwise. |
inactivityMessages | array | Messages spoken by the agent when the user is inactive for the specified duration. See below for more information. |
initialOutputMedium | string | The initial medium (MESSAGE_MEDIUM_VOICE or MESSAGE_MEDIUM_TEXT) to use for the call. Defaults to MESSAGE_MEDIUM_VOICE. Once the call has started, the output medium can be changed by the client using setOutputMedium. |
joinTimeout | string | A timeout for joining the call. Specified in seconds. Defaults to “0s” (5 minutes). |
joinUrl | string | URL to use with the ultravox client library to join (AKA start) the call. |
languageHint | string | BCP47 identifier used as a hint to identify the user’s spoken language. |
maxDuration | string | Maximum length (in seconds) that was set for the call. |
model | string | Name of the model used for the call. |
systemPrompt | string | The system prompt used to create the call. |
temperature | float | The temperature setting used for the model in this call. |
timeExceededMessage | string | Message that the agent will say if the call reaches the maxDuration. |
transcriptOptional | boolean | Defaults to true. Setting to false (not recommended) enables live user transcripts at the expense of latency. Furthermore, these transcripts may not match what the model actually hears. |
voice | string | The name of the AI voice used for the call. |
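As a rough illustration of the request described above, the following Python sketch assembles a Create Call request without sending it. The base URL and the `X-API-Key` header are assumptions that this section does not state, `build_create_call_request` is a hypothetical helper, and the duration check is a simplified reading of the maxDuration format. Only the field names come from the tables above.

```python
import json
import re
import urllib.parse
import urllib.request

# Assumed values: this section does not state the API host or the auth header.
BASE_URL = "https://api.ultravox.ai/api"
API_KEY_HEADER = "X-API-Key"

# Simplified check for duration strings such as "300s" or "245.5s" (an assumption).
DURATION_RE = re.compile(r"^(?:0|[1-9][0-9]{0,11})(?:\.[0-9]{1,9})?s$")


def build_create_call_request(api_key, system_prompt, prior_call_id=None, **options):
    """Build (but do not send) a POST /calls request."""
    body = {"systemPrompt": system_prompt, **options}

    # initialMessages may not be set in the body when priorCallId is used.
    if prior_call_id is not None and "initialMessages" in body:
        raise ValueError("initialMessages cannot be combined with priorCallId")

    max_duration = body.get("maxDuration")
    if max_duration is not None and not DURATION_RE.match(max_duration):
        raise ValueError(f"maxDuration must look like '300s', got {max_duration!r}")

    url = f"{BASE_URL}/calls"
    if prior_call_id is not None:
        # priorCallId is a query parameter, not a body field.
        url += "?" + urllib.parse.urlencode({"priorCallId": prior_call_id})

    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={API_KEY_HEADER: api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the body easy to inspect first.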
Get Call
GET /calls/{call_id}
Gets details for the call with the call_id specified in the path. Account scoped.
Parameters
callId required | Unique identifier for the call to retrieve. |
No request body |
callId | string | Unique identifier for the call. |
created | string | Datetime in UTC when the call was created. |
ended | string | Datetime in UTC when the call was ended. Will be null upon call creation and while call is underway. |
model | string | Name of the model used for the call. |
systemPrompt | string | The system prompt used to create the call. |
temperature | float | The temperature setting used for the model in this call. |
voice | string | The name of the AI voice used for the call. |
languageHint | string | BCP47 identifier used as a hint to identify the user’s spoken language. |
joinUrl | string | URL to use with the ultravox client library to join (AKA start) the call. |
List Call Messages
GET /calls/{call_id}/messages
Lists all messages generated during the given call.
Parameters
callId required | Unique identifier of the call for which messages are being retrieved. |
cursor | Pagination cursor. |
No request body |
next | string | URL with the cursor value for the next page of results. |
previous | string | URL with the cursor value for the previous page of results. |
results | array | Array of message objects. Each message object contains: |
role | string | Role that generated the message. One of: MESSAGE_ROLE_USER or MESSAGE_ROLE_AGENT. |
text | string | The message text. |
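The paginated response above can be consumed by following `next` until no further page is offered. This is a minimal sketch: it assumes that `next` is null or absent on the last page and that authentication uses an `X-API-Key` header, neither of which is stated in this section. `fetch_json` and `iter_messages` are hypothetical helper names.

```python
import json
import urllib.request

API_KEY_HEADER = "X-API-Key"  # assumed header name, not stated in this section


def fetch_json(url, api_key):
    """GET a URL and decode the JSON response body."""
    request = urllib.request.Request(url, headers={API_KEY_HEADER: api_key})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_messages(first_page_url, api_key, fetch=fetch_json):
    """Yield every message object, following `next` until it is empty."""
    url = first_page_url
    while url:
        page = fetch(url, api_key)
        yield from page.get("results", [])
        url = page.get("next")  # assumed to be null/absent on the last page
```

The `fetch` argument is injectable so the paging logic can be exercised without network access.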
Get Call Recording
GET /calls/{call_id}/recording
Returns a link to the recording of the call (via a 302 redirect to the file location). The recording only becomes available after the call ends. If the recording is not yet available, a 425 (Too Early) HTTP status will be returned.
Parameters
callId required | Unique identifier of the call for which the recording is being retrieved. |
No request body |
detail | string | Only returned if call recording was not enabled. If a recording was enabled, there is no response body and the call recording location is provided via a 302 redirect. |
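The three outcomes described above (redirect, too early, no recording) can be told apart by status code. The sketch below is one possible client-side mapping, assuming the client does not follow redirects automatically and that the redirect target arrives in the standard `Location` header. `interpret_recording_response` is a hypothetical helper.

```python
def interpret_recording_response(status, headers, body=None):
    """Map a GET /calls/{call_id}/recording response to a simple outcome.

    Returns a (state, value) tuple:
      ("ready", url)          - 302 redirect to the recording file
      ("too_early", None)     - 425, the recording is not available yet
      ("unavailable", detail) - anything else, e.g. recording was not enabled
    """
    if status == 302:
        return ("ready", headers.get("Location"))
    if status == 425:
        return ("too_early", None)
    detail = (body or {}).get("detail")
    return ("unavailable", detail)
```

A caller would typically retry later on `"too_early"` and surface the `detail` text on `"unavailable"`.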
List Call Stages
GET /calls/{call_id}/stages
Lists all stages that occurred during the specified call. Stages represent distinct segments of the conversation where different parameters (e.g. system prompt or tools) may have been used.
Parameters
callId required | Unique identifier of the call for which stages are being retrieved. |
cursor | Pagination cursor. |
callId | string | Unique identifier for the call. |
callStageId | string | Unique identifier for this stage of the call. |
created | string | Datetime in UTC when the stage was created. |
inactivityMessages | array | Inactivity messages used for this stage. |
languageHint | string | BCP47 language code used during this stage. |
model | string | Name of the model used for this stage. |
systemPrompt | string | The system prompt used during this stage. |
temperature | float | The temperature setting used during this stage. |
timeExceededMessage | string | The time exceeded message used for this stage. |
voice | string | The voice used during this stage. |
Get Call Stage
GET /calls/{call_id}/stages/{call_stage_id}
Retrieves details for a specific stage of a call.
Parameters
callId required | Unique identifier of the call. |
callStageId required | Unique identifier of the stage to retrieve. |
callId | string | Unique identifier for the call. |
callStageId | string | Unique identifier for this stage. |
created | string | Datetime in UTC when the stage was created. |
inactivityMessages | array | Inactivity messages used for this stage. |
languageHint | string | BCP47 language code used during this stage. |
model | string | Name of the model used for this stage. |
systemPrompt | string | The system prompt used during this stage. |
temperature | float | The temperature setting used during this stage. |
timeExceededMessage | string | The time exceeded message used for this stage. |
voice | string | The voice used during this stage. |
List Call Stage Messages
GET /calls/{call_id}/stages/{call_stage_id}/messages
Lists all messages that were exchanged during a specific stage of a call.
Parameters
callId required | Unique identifier of the call. |
callStageId required | Unique identifier of the stage. |
cursor | Pagination cursor. |
next | string | URL with the cursor value for the next page of results. |
previous | string | URL with the cursor value for the previous page of results. |
results | array | Array of message objects containing: |
role | string | Role that generated the message. One of: MESSAGE_ROLE_USER, MESSAGE_ROLE_AGENT, MESSAGE_ROLE_TOOL_CALL, or MESSAGE_ROLE_TOOL_RESULT. |
text | string | The message text, tool arguments for tool_call messages, or tool results for tool_result messages. |
invocationId | string | For tool messages, the invocation ID used to pair tool calls with their results. |
toolName | string | For tool messages, the name of the tool being called. |
errorDetails | string | For failed tool calls, additional debugging information. |
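Because tool calls and tool results are separate messages linked by invocationId, a client often wants them grouped. The following is a minimal sketch of that grouping over already-fetched message objects. `pair_tool_calls` is a hypothetical helper, and it assumes every tool message carries an invocationId.

```python
TOOL_CALL = "MESSAGE_ROLE_TOOL_CALL"
TOOL_RESULT = "MESSAGE_ROLE_TOOL_RESULT"


def pair_tool_calls(messages):
    """Group tool call and tool result messages by their invocationId.

    Returns {invocationId: {"toolName": ..., "call": ..., "result": ...}},
    where "call" holds the tool arguments and "result" the tool output.
    """
    pairs = {}
    for message in messages:
        role = message.get("role")
        if role not in (TOOL_CALL, TOOL_RESULT):
            continue  # skip user and agent messages
        entry = pairs.setdefault(
            message["invocationId"],
            {"toolName": message.get("toolName"), "call": None, "result": None},
        )
        entry["call" if role == TOOL_CALL else "result"] = message.get("text")
    return pairs
```

An entry whose `"result"` is still `None` indicates a call with no matching result among the messages supplied.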
List Call Stage Tools
GET /calls/{call_id}/stages/{call_stage_id}/tools
Lists all tools that were available during a specific stage of a call.
Parameters
callId required | Unique identifier of the call. |
callStageId required | Unique identifier of the stage. |
callToolId | string | Unique identifier for this tool instance. |
toolId | string | Reference to the original tool definition. |
name | string | The possibly overridden name of the tool. |
definition | object | The tool definition containing parameters and implementation details. |
List Call Tools
GET /calls/{call_id}/tools
Lists all tools that were available at any point during the call.
Parameters
callId required | Unique identifier of the call. |
callToolId | string | Unique identifier for this tool instance. |
toolId | string | Reference to the original tool definition. |
name | string | The possibly overridden name of the tool. |
definition | object | The tool definition containing parameters and implementation details. |
More Info
This section contains additional details for some properties.
inactivityMessages
Inactivity messages allow your agent to gracefully handle periods of user silence and end the call after a period of user inactivity. This feature helps maintain engagement and ensures calls don’t remain open indefinitely when users have disconnected or finished their interaction.
- Messages are Ordered → Messages are delivered by the agent in the order provided.
- Message Durations are Cumulative → The first message is delivered when the user has been inactive for its duration. Each subsequent message m is delivered its duration after message m-1 (provided the user remains inactive).
- User Interactions Reset → Any activity from the user will reset the message sequence.
- Different Behaviors → Messages can have different end behaviors and can terminate the call.
Best Practices
- Keep messages concise and natural-sounding.
- Start with friendly check-in messages before moving to call termination.
- Provide clear context in messages if the call will be terminated.
Message Format
When creating a new call, inactivityMessages is an array of message objects. Each message must provide the following:
- duration
  - string - The duration (in seconds) after which the message should be spoken.
  - pattern - ^-?(?:0|[1-9][0-9]{0,11})(?:\.[0-9]{1,9})?s$
  - examples - “60s”, “5.5s”
- message
  - string - The message to speak.
  - examples - “Are you still there?”, “If there’s nothing else, I will end the call now.”
- endBehavior
  - string - The behavior to exhibit when the message is finished being spoken. Must be one of the enumerated values.
  - enum
    - END_BEHAVIOR_UNSPECIFIED - Default. The system will continue to wait for user input.
    - END_BEHAVIOR_HANG_UP_SOFT - Will end the call unless the user interacts while the agent is delivering the message.
    - END_BEHAVIOR_HANG_UP_STRICT - Will end the call after speaking the message, regardless of whether the user interrupts.
Example
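A minimal sketch of an inactivityMessages array consistent with the walkthrough that follows, written as a Python literal and serialized to JSON. The text of the third message is a placeholder, since the walkthrough does not quote it, and the choice of `END_BEHAVIOR_HANG_UP_SOFT` is inferred from the described behavior.

```python
import json

inactivity_messages = [
    {"duration": "30s", "message": "Are you still there?"},
    {"duration": "15s", "message": "If there's nothing else, may I end the call?"},
    {
        "duration": "10s",
        # Placeholder wording: the walkthrough does not quote this message.
        "message": "Thank you for calling. Goodbye.",
        # Ends the call unless the user interrupts the message.
        "endBehavior": "END_BEHAVIOR_HANG_UP_SOFT",
    },
]

# Fragment of a Create Call request body.
print(json.dumps({"inactivityMessages": inactivity_messages}, indent=2))
```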
Here’s what would happen based on the example above:
- Call starts.
- After 30 seconds of no user interaction, agent says “Are you still there?”.
- If user interacts, call continues. If no user interaction occurs for another 15 seconds, agent says “If there’s nothing else, may I end the call?”.
- If no user interaction occurs for another 10 seconds, agent says the provided message and the call ends unless the agent is interrupted during this final message.
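The cumulative timing in the steps above can be checked with a small calculation. This sketch parses duration strings with the pattern given under Message Format and sums them. It ignores the time the agent spends speaking each message, which is a simplification, and `parse_seconds` and `inactivity_schedule` are hypothetical helpers.

```python
import re
from itertools import accumulate

# Pattern copied from the duration field description.
DURATION_RE = re.compile(r"^-?(?:0|[1-9][0-9]{0,11})(?:\.[0-9]{1,9})?s$")


def parse_seconds(duration):
    """Convert a duration string such as "5.5s" to a float number of seconds."""
    if not DURATION_RE.match(duration):
        raise ValueError(f"invalid duration: {duration!r}")
    return float(duration[:-1])  # drop the trailing "s"


def inactivity_schedule(messages):
    """Return the cumulative silence (in seconds) at which each message is spoken."""
    return list(accumulate(parse_seconds(m["duration"]) for m in messages))


print(inactivity_schedule([{"duration": "30s"}, {"duration": "15s"}, {"duration": "10s"}]))
# [30.0, 45.0, 55.0]
```

Any user activity resets the sequence, so these offsets apply only to one uninterrupted stretch of silence.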
initialMessages
When creating a new call or a new call stage, you can provide messages to the agent via initialMessages. By default, new calls don’t have initial messages, and call stages inherit the prior stage’s messages. New calls inherit messages if priorCallId is set.
These messages can give the agent call history or provide few-shot examples (e.g. if you want the agent to learn how to respond in a specific way to user input).
Message Format
initialMessages must be an array of message objects where each message contains a role and text. See “Response” under List Call Messages above for more details.
Here’s an example:
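A minimal sketch of an initialMessages array, written as a Python literal and serialized to JSON. The conversation text and system prompt are purely illustrative. The role values are the ones listed under List Call Messages.

```python
import json

initial_messages = [
    {"role": "MESSAGE_ROLE_USER", "text": "My name is Steve."},
    {"role": "MESSAGE_ROLE_AGENT", "text": "Nice to meet you, Steve. How can I help?"},
]

# Illustrative Create Call request body that starts from the history above.
request_body = {
    "systemPrompt": "You are a helpful assistant.",
    "initialMessages": initial_messages,
}

print(json.dumps(request_body, indent=2))
```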
Using Mistral
When using fixie-ai/ultravox-mistral-nemo-12B:
- Empty System Prompt → Set systemPrompt to an empty string ("").
- Prompt in Initial Messages → Add system prompt instructions as the first user message in initialMessages.
- Proper Turns → Maintain strict user > agent > user message alternation.
Here’s an example of what the request body for creating the call might look like:
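A minimal sketch that builds such a body in Python and enforces the three rules listed above. `build_mistral_call_body` is a hypothetical helper and the message text is illustrative. The model name and field names come from this section.

```python
import json

USER = "MESSAGE_ROLE_USER"
AGENT = "MESSAGE_ROLE_AGENT"
MISTRAL_MODEL = "fixie-ai/ultravox-mistral-nemo-12B"


def build_mistral_call_body(instructions, history=()):
    """Build a Create Call body that follows the Mistral-specific rules."""
    # The instructions go in the first user message, not in systemPrompt.
    messages = [{"role": USER, "text": instructions}, *history]

    # Enforce strict user > agent > user alternation.
    for index, message in enumerate(messages):
        expected = USER if index % 2 == 0 else AGENT
        if message["role"] != expected:
            raise ValueError(f"message {index} should have role {expected}")

    return {"model": MISTRAL_MODEL, "systemPrompt": "", "initialMessages": messages}


body = build_mistral_call_body(
    "You are a concise, friendly assistant.",
    [{"role": AGENT, "text": "Understood. How can I help?"}],
)
print(json.dumps(body, indent=2))
```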