# Protocol & Data Messages

> Protocol documentation for messages exchanged between client and server during Ultravox calls.

Data messages are used to communicate non-audio information during Ultravox calls. These messages enable real-time control and interaction with ongoing conversations.

## Communication Methods

* **Client Data Channels** → Used by our [SDKs](/apps/sdks) and [WebSocket](/apps/websockets) connections for bi-directional, real-time message exchange during calls. This is the primary method for client apps to interact with calls.
* **Data Connection** → Add a [data connection](/api-reference/calls/calls-post#body-data-connection) to your call to receive messages via a separate WebSocket connection. This is particularly useful for:
  * Telephony integrations where the client doesn't support WebRTC
  * Server-side applications that need to monitor call events or route data to external systems
* **REST API** → Inject messages into active calls via HTTP POST requests. See [Sending Messages to Live Calls via REST API](#sending-messages-to-live-calls-via-rest-api) below for detailed implementation guidelines.

## Messages at a Glance

Details on each message type appear below in [Data Message Details](#data-message-details).

### Client-to-Server Messages

| Type           | Message                                                                                         | Description                                                            |
| -------------- | ----------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------- |
| Agent Behavior | [ForcedAgentMessage](#forcedagentmessage)                                                       | Forces the agent to say a specific message or invoke tools.            |
| Call Control   | [HangUp](#hangup)                                                                               | Instructs the agent to end the call with an optional farewell message. |
| Call Control   | [SetOutputMedium](#setoutputmedium)                                                             | Sets server's output medium to text or voice.                          |
| System         | [Ping](#ping)                                                                                   | Measures round-trip data latency.                                      |
| Tools          | [ClientToolResult and DataConnectionToolResult](#clienttoolresult-and-dataconnectiontoolresult) | Contains the result of a tool invocation.                              |
| User Input     | [UserTextMessage](#usertextmessage)                                                             | Used to send a user message to the agent.                              |
| Threads        | [SpawnThread](#spawnthread)                                                                     | Spawns a new parallel thread. See [Threads](/agents/threads).          |

### Server-to-Client Messages

| Type         | Message                                                                                                         | Description                                                  |
| ------------ | --------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------ |
| Conversation | [CallStarted](#callstarted)                                                                                     | Provides some basic information about the call at its start. |
| Conversation | [Transcript](#transcript)                                                                                       | Contains text for an utterance made during the call.         |
| System       | [Debug](#debug)                                                                                                 | Useful for application debugging. Excluded by default.       |
| System       | [PlaybackClearBuffer](#playbackclearbuffer)                                                                     | Used to clear buffered output audio. WebSocket only.         |
| System       | [Pong](#pong)                                                                                                   | Server reply to a ping message.                              |
| System       | [State](#state)                                                                                                 | Indicates the server's current state.                        |
| Threads      | [ThreadSpawned](#threadspawned)                                                                                 | Confirms a thread was successfully created.                  |
| Threads      | [ThreadRejected](#threadrejected)                                                                               | Indicates a thread spawn request was rejected.               |
| Threads      | [ThreadTerminated](#threadterminated)                                                                           | Indicates a thread has stopped running.                      |
| Threads      | [SideGenerationDelta](#sidegenerationdelta)                                                                     | Streaming text delta from a thread's generation.             |
| Threads      | [SideGenerationCompleted](#sidegenerationcompleted)                                                             | A thread's generation has completed.                         |
| Tools        | [ClientToolInvocation and DataConnectionToolInvocation](#clienttoolinvocation-and-dataconnectiontoolinvocation) | Asks the client or data connection to invoke a tool.         |

## Data Message Details

All messages are JSON objects with camelCase keys containing:

* A required `type` field identifying the message type
* Additional fields specific to each message type
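
Because every message carries a `type`, a client can route incoming messages with a simple lookup. The following is a minimal sketch, not part of any Ultravox SDK: `dispatchDataMessage` and the handler map are hypothetical names, and silently ignoring unknown types is a design assumption rather than documented behavior.

```javascript
// Hypothetical helper: parse a raw data message and route it by its `type` field.
function dispatchDataMessage(raw, handlers) {
  const message = JSON.parse(raw);
  if (typeof message.type !== "string") {
    throw new Error("Data message is missing the required `type` field");
  }
  const handler = handlers.get(message.type);
  // Assumption: unknown types are ignored so new message types do not break the client.
  return handler ? handler(message) : undefined;
}

// Handlers keyed by message type; each receives the parsed message object.
const handlers = new Map([
  ["state", (msg) => `state:${msg.state}`],
  ["pong", (msg) => `pong:${msg.timestamp}`],
]);

const stateResult = dispatchDataMessage('{"type":"state","state":"listening"}', handlers);
const unknownResult = dispatchDataMessage('{"type":"some_future_type"}', handlers);
```

Using a `Map` rather than a plain object avoids accidental matches against inherited property names such as `constructor`.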

### Ping

A message sent by the client to measure round-trip data message latency.

**Message Structure**

```js theme={null}
{
  "type": "ping",
  "timestamp": 1234567890.123
}
```

**Fields**

<ResponseField name="timestamp" type="float" required>
  Client timestamp used for latency measurement, given as a Unix timestamp with millisecond precision.
</ResponseField>

### Pong

A message sent by the server in response to a [Ping](#ping) message. The timestamp is copied from the ping so the client can compute the round-trip time.

**Message Structure**

```js theme={null}
{
  "type": "pong",
  "timestamp": 1234567890.123
}
```

**Fields**

<ResponseField name="timestamp" type="float" required>
  Echoed timestamp from the original ping message.
</ResponseField>
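
Since the pong echoes the ping's timestamp, the client can compute latency from its own clock alone. The sketch below assumes the timestamp is expressed in seconds, as the example value `1234567890.123` suggests; `makePing` and `roundTripMs` are hypothetical helper names.

```javascript
// Build a ping from the client's clock (assumed unit: seconds, millisecond precision).
function makePing(nowMs = Date.now()) {
  return { type: "ping", timestamp: nowMs / 1000 };
}

// Round-trip latency in milliseconds, derived from the echoed timestamp.
function roundTripMs(pong, nowMs = Date.now()) {
  if (pong.type !== "pong") {
    throw new Error(`Expected a pong message, got "${pong.type}"`);
  }
  // Rounding absorbs floating-point error from the seconds/milliseconds conversion.
  return Math.round(nowMs - pong.timestamp * 1000);
}

const ping = makePing(1_700_000_000_000);
const pong = { type: "pong", timestamp: ping.timestamp }; // what the server would echo
const latency = roundTripMs(pong, 1_700_000_000_042);
```

No clock synchronization with the server is needed, because both readings come from the client.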

### State

A message sent by the server to indicate its current state.

**Message Structure**

```js theme={null}
{
  "type": "state",
  "state": "listening"
}
```

**Fields**

<ResponseField name="state" type="string" required>
  Current session state. One of: `idle`, `listening`, `thinking`, or `speaking`.
</ResponseField>

### Transcript

A message containing text transcripts of user and agent utterances.

**Message Structure**

```js theme={null}
{
  "type": "transcript",
  "role": "agent",
  "medium": "voice",
  "text": "Full transcript so far",  // Either text or delta will be set
  "delta": null,
  "final": false,
  "ordinal": 1
}
```

**Fields**

<ResponseField name="role" type="string" required>
  Who emitted the utterance. Must be either "user" or "agent".
</ResponseField>

<ResponseField name="medium" type="string" default="voice">
  The medium through which the utterance was emitted. Either "text" or "voice".
</ResponseField>

<ResponseField name="text" type="string">
  The full text of the transcript so far. Either this or delta will be set, but not both.
</ResponseField>

<ResponseField name="delta" type="string">
  The additional transcript text added since the last transcript message. Either this or text will be set, but not both.
</ResponseField>

<ResponseField name="final" type="boolean" required>
  Whether this is the last transcript message for this conversation round. When `false`, expect additional transcript messages.
</ResponseField>

<ResponseField name="ordinal" type="integer" required>
  The ordinal of the message within the current call, used for ordering transcripts.
</ResponseField>
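
One way to consume transcripts is to keep a running entry per `ordinal`, replacing it when `text` arrives and appending when `delta` arrives. This is a sketch under the assumption that messages sharing an ordinal describe the same utterance; `TranscriptLog` is a hypothetical class, not an SDK type.

```javascript
// Hypothetical accumulator that keeps the latest text for each transcript ordinal.
class TranscriptLog {
  constructor() {
    this.entries = new Map(); // ordinal -> { role, medium, text, final }
  }

  apply(msg) {
    const previous = this.entries.get(msg.ordinal);
    const previousText = previous ? previous.text : "";
    // `text` carries the full transcript so far; `delta` carries only the new part.
    const text = msg.text != null ? msg.text : previousText + (msg.delta ?? "");
    this.entries.set(msg.ordinal, {
      role: msg.role,
      medium: msg.medium ?? "voice",
      text,
      final: msg.final,
    });
  }

  // Utterances sorted by ordinal, i.e. in call order.
  ordered() {
    return [...this.entries.entries()]
      .sort(([a], [b]) => a - b)
      .map(([ordinal, entry]) => ({ ordinal, ...entry }));
  }
}

const log = new TranscriptLog();
log.apply({ type: "transcript", role: "agent", delta: "Hello", final: false, ordinal: 1 });
log.apply({ type: "transcript", role: "agent", delta: " there", final: true, ordinal: 1 });
log.apply({ type: "transcript", role: "user", text: "Hi", final: true, ordinal: 0 });
```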

### UserTextMessage

A user message sent via text. The message appears to the agent as if it came from the user.

**Message Structure**

```js theme={null}
{
  "type": "user_text_message",
  "text": "Your message here",
  "urgency": "soon"  // Optional, defaults to "soon"
}
```

**Fields**

<ResponseField name="text" type="string" required>
  The content of the user message.
</ResponseField>

<ResponseField name="urgency" type="string" default="soon">
  Determines whether this message can interrupt the agent and whether it should trigger a generation. Options:

  * immediate → Interrupts the agent if speaking and starts a new generation immediately.
  * soon → Doesn't interrupt but starts a generation at the next opportunity.
  * later → Message is considered during the next natural generation without forcing a new generation.
</ResponseField>

<ResponseField name="threadId" type="string" default="UI">
  Optional thread ID to send this message to. Defaults to "UI" (the main conversation).
</ResponseField>

### SetOutputMedium

Message sent by the client to set the server's output medium.

**Message Structure**

```js theme={null}
{
  "type": "set_output_medium",
  "medium": "voice"
}
```

**Fields**

<ResponseField name="medium" type="string" required>
  Output medium to use. Must be either "voice" or "text".
</ResponseField>

### ClientToolInvocation and DataConnectionToolInvocation

Sent by the server to ask the client or data connection to invoke a tool with the given parameters. The client or data connection is expected to reply with a [ClientToolResult or DataConnectionToolResult](#clienttoolresult-and-dataconnectiontoolresult) message carrying the same `invocationId`.

**Message Structure**

```js theme={null}
{
  "type": "client_tool_invocation", // Or "data_connection_tool_invocation" for data connections
  "toolName": "get_weather",
  "invocationId": "unique-invocation-id",
  "parameters": {
    "location": "Seattle"
  }
}
```

**Fields**

<ResponseField name="toolName" type="string" required>
  Name of the tool to invoke.
</ResponseField>

<ResponseField name="invocationId" type="string" required>
  Unique identifier for this invocation. Must be included in the corresponding result.
</ResponseField>

<ResponseField name="parameters" type="object" required>
  Tool-specific parameters as a JSON object.
</ResponseField>

### ClientToolResult and DataConnectionToolResult

Contains the result of a tool invocation.

**Message Structure**

```js theme={null}
{
  "type": "client_tool_result", // Or "data_connection_tool_result" for data connections
  "invocationId": "matching-invocation-id",
  "result": "Tool execution result",
  "responseType": "tool-response",
  "agentReaction": "speaks",
  "errorType": null,
  "errorMessage": null,
  "updateCallState": null
}
```

**Fields**

<ResponseField name="invocationId" type="string" required>
  Must match the invocationId from the corresponding invocation.
</ResponseField>

<ResponseField name="result" type="string">
  Typically the tool execution result as viewed by the agent, which is often a JSON string. May be omitted for errors.
  For responseTypes other than `tool-response`, this may be a JSON string for an object that further specifies how the response should be handled. See [special response types](https://docs.ultravox.ai/tools/custom/changing-call-state#special-tool-response-types).
</ResponseField>

<ResponseField name="responseType" type="string" default="tool-response">
  Type of response being provided. See [special response types](https://docs.ultravox.ai/tools/custom/changing-call-state#special-tool-response-types).
</ResponseField>

<ResponseField name="agentReaction" type="string" default="speaks">
  How the agent should react. Options: "speaks" (default), "listens", or "speaks-once". See [Agent Responses to Tools](/tools/custom/agent-responses) for more.
</ResponseField>

<ResponseField name="errorType" type="string">
  Error classification if the tool failed. Should be omitted when result is set.

  Options:

  * undefined → Tool with the given name does not exist
  * implementation-error → Tool exists but execution failed
</ResponseField>

<ResponseField name="errorMessage" type="string">
  Human-readable error description if the tool failed. This is not seen by the model but may be used for debugging.
</ResponseField>

<ResponseField name="updateCallState" type="object">
  Optional state updates to apply to the call. See [Tool State](/agents/guiding-agents#tool-state) for more.
</ResponseField>
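
The invocation and result messages pair up as a request and reply. Below is a minimal sketch of that round trip, assuming synchronous tool functions held in a hypothetical `tools` registry; `handleToolInvocation` is an illustrative name, and only fields documented above are emitted.

```javascript
// Hypothetical registry of client tools, keyed by tool name.
const tools = new Map([
  ["get_weather", ({ location }) => ({ location, forecast: "rain" })],
  ["always_fails", () => { throw new Error("backend unavailable"); }],
]);

// Turn a tool invocation message into the matching result message.
function handleToolInvocation(invocation, registry) {
  // Reply with the result type that corresponds to the invocation type.
  const type = invocation.type === "data_connection_tool_invocation"
    ? "data_connection_tool_result"
    : "client_tool_result";
  const base = { type, invocationId: invocation.invocationId };

  const tool = registry.get(invocation.toolName);
  if (!tool) {
    // No tool with this name: report errorType "undefined" and omit `result`.
    return { ...base, errorType: "undefined", errorMessage: `Unknown tool: ${invocation.toolName}` };
  }
  try {
    // The agent sees `result`, so structured output is serialized to a JSON string.
    return { ...base, result: JSON.stringify(tool(invocation.parameters)) };
  } catch (err) {
    return { ...base, errorType: "implementation-error", errorMessage: String(err.message) };
  }
}

const okResult = handleToolInvocation(
  { type: "client_tool_invocation", toolName: "get_weather", invocationId: "inv-1", parameters: { location: "Seattle" } },
  tools,
);
```

Omitted optional fields such as `responseType` and `agentReaction` fall back to their documented defaults.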

### Debug

A message sent by the server to communicate debug information. Disabled by default.

**Message Structure**

```js theme={null}
{
  "type": "debug",
  "message": "Debug information here"
}
```

**Fields**

<ResponseField name="message" type="string" required>
  Debug information or diagnostic details.
</ResponseField>

<Note>
  Debug messages are disabled by default and must be explicitly enabled for debugging purposes.
</Note>

### CallStarted

Basic call metadata shared by the server when a call begins.

**Message Structure**

```js theme={null}
{
  "type": "call_started",
  "callId": "550e8400-e29b-41d4-a716-446655440000"
}
```

**Fields**

<ResponseField name="callId" type="string" required>
  The UUID of the call that has started.
</ResponseField>

### PlaybackClearBuffer

Message sent by the server to clear buffered output audio. Integrators should drop as much unplayed output audio as possible for interruptions to function properly.

**Message Structure**

```js theme={null}
{
  "type": "playback_clear_buffer"
}
```

<Note>
  This message is only used with WebSocket connections. Handling this message allows for [larger client buffers](/api-reference/agents/agents-post#body-call-template-medium-server-web-socket-client-buffer-size-ms) while maintaining responsive interrupts. Larger client buffers make choppy audio less likely in the presence of network disruption or resource contention.
</Note>
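
A WebSocket integration might handle this message roughly as follows. The sketch assumes output audio is queued in chunks before playback; `PlaybackBuffer` and `onDataMessage` are hypothetical names, and a real client would also need to stop audio already handed to the output device where possible.

```javascript
// Hypothetical playback queue for audio chunks received over the WebSocket.
class PlaybackBuffer {
  constructor() {
    this.chunks = [];
  }

  enqueue(chunk) {
    this.chunks.push(chunk);
  }

  // Next chunk to hand to the audio device, or undefined when the queue is empty.
  dequeue() {
    return this.chunks.shift();
  }

  // Drop everything not yet played; returns the number of chunks dropped.
  clear() {
    const dropped = this.chunks.length;
    this.chunks.length = 0;
    return dropped;
  }
}

// Returns how many queued chunks were discarded in response to the message.
function onDataMessage(message, buffer) {
  if (message.type === "playback_clear_buffer") {
    return buffer.clear();
  }
  return 0;
}

const buffer = new PlaybackBuffer();
["chunk-1", "chunk-2", "chunk-3"].forEach((chunk) => buffer.enqueue(chunk));
const firstPlayed = buffer.dequeue();
const droppedCount = onDataMessage({ type: "playback_clear_buffer" }, buffer);
```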

### ForcedAgentMessage

Forces the agent to say a specific message or invoke tools.

**Message Structure**

```js theme={null}
{
  "type": "forced_agent_message",
  "content": "Text for the agent to say",  // Optional (default: "")
  "toolCalls": [  // Optional array of tool calls
    {
      "id": "unique-invocation-id",  // Optional, generated if not provided
      "name": "tool_name",
      "arguments": {
        "param1": "value1"
      }
    }
  ],
  "knownToolResults": [  // Optional, skips execution of matching tool calls using these results instead
    {
      "invocationId": "unique-invocation-id",
      "result": "Tool execution result",
      "responseType": "tool-response",
      "agentReaction": "speaks"
    }
  ],
  "uninterruptible": false,  // Optional (default: false)
  "urgency": "soon"  // Optional: "immediate" or "soon" (default: "soon")
}
```

**Fields**

<ResponseField name="content" type="string" default="">
  Text content the agent should speak.
</ResponseField>

<ResponseField name="toolCalls" type="array">
  Array of tool invocations to execute.
</ResponseField>

<ResponseField name="knownToolResults" type="array">
  Array of known tool results. If specified, any entry in `toolCalls` whose `id` matches a result's `invocationId` uses that result instead of executing the tool. See [ClientToolResult](#clienttoolresult-and-dataconnectiontoolresult) for the structure of each result object.
</ResponseField>

<ResponseField name="uninterruptible" type="boolean" default="false">
  If true, prevents interruption while the agent speaks this message. (Note that tools are always uninterruptible.)
</ResponseField>

<ResponseField name="urgency" type="string" default="soon">
  Controls when the message is processed. Must be either "immediate" (may interrupt the user or agent) or "soon" (process at next opportunity).
</ResponseField>

<ResponseField name="threadId" type="string" default="UI">
  Optional thread ID to send this message to. Defaults to "UI" (the main conversation).
</ResponseField>
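
To see which tool calls in a forced message would still need to run, a client can compare `toolCalls` against `knownToolResults`. This sketch assumes a call's `id` is matched against a result's `invocationId`, as the example above suggests; `unmatchedToolCalls` is a hypothetical helper.

```javascript
// Tool calls that have no entry in `knownToolResults` and would therefore be executed.
function unmatchedToolCalls(message) {
  const known = new Set((message.knownToolResults ?? []).map((r) => r.invocationId));
  return (message.toolCalls ?? []).filter((call) => !known.has(call.id));
}

const forced = {
  type: "forced_agent_message",
  toolCalls: [
    { id: "call-a", name: "lookup_order", arguments: { orderId: "123" } },
    { id: "call-b", name: "send_receipt", arguments: { orderId: "123" } },
  ],
  knownToolResults: [
    // The result for call-a is already known, so that tool is not executed.
    { invocationId: "call-a", result: '{"status":"shipped"}', responseType: "tool-response" },
  ],
  urgency: "soon",
};

const stillToRun = unmatchedToolCalls(forced).map((call) => call.name);
```

A tool call without an `id` never matches a known result in this sketch, since the server generates its identifier.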

### HangUp

Instructs the agent to end the call with an optional farewell message.

**Message Structure**

```js theme={null}
{
  "type": "hang_up",
  "message": "Goodbye!"
}
```

**Fields**

<ResponseField name="message" type="string" default="">
  Final message to speak before ending the call.
</ResponseField>

## Thread Messages

The following messages manage threads: parallel conversations that run alongside the main call. For conceptual background and usage examples, see the [Threads guide](/agents/threads).

### SpawnThread

Spawns a new parallel thread that forks from the parent's conversation history. See [Threads](/agents/threads) for a full explanation.

**Message Structure**

```js theme={null}
{
  "type": "spawn_thread",
  "newThreadId": "my-thread-1",       // Optional, auto-generated if omitted
  "parentThreadId": "UI",             // Optional, defaults to "UI" (main conversation)
  "ifExists": "reject",              // Optional, defaults to "reject"
  "additionalMessages": [             // Optional
    {
      "type": "user_text_message",
      "text": "Perform this background task."
    }
  ],
  "toolFilter": {                     // Optional
    "allowedTools": ["searchDatabase"]
  },
  "limits": {                         // Optional
    "generationLimit": 5,
    "threadOutputTokenLimit": 2000
  }
}
```

**Fields**

<ResponseField name="newThreadId" type="string">
  Unique identifier for the new thread. If omitted, an id will be auto-generated.
</ResponseField>

<ResponseField name="parentThreadId" type="string" default="UI">
  The thread to fork from. Defaults to "UI" (the main conversation). If the referenced thread has failed or doesn't exist, this spawn attempt will be rejected.
</ResponseField>

<ResponseField name="ifExists" type="string" default="reject">
  Controls behavior when `newThreadId` matches an existing thread. Options:

  * `reject` (default) → The spawn request is rejected and a [ThreadRejected](#threadrejected) message is sent.
  * `replace` → The existing thread is terminated and a new thread is spawned with the same ID. Useful for re-syncing a thread with the current conversation state.
</ResponseField>

<ResponseField name="additionalMessages" type="array">
  Messages to append to the forked history before the thread starts generating. Each element must be either a [UserTextMessage](#usertextmessage) or [ForcedAgentMessage](#forcedagentmessage). Every ForcedAgentMessage except the last must have known results for each tool call. The last may have unmatched calls, in which case the thread will begin by executing those.
</ResponseField>

<ResponseField name="toolFilter" type="object">
  Restricts which tools the thread can use. Specify either `allowedTools` (allowlist) or `disallowedTools` (blocklist), or both. If both are provided, the allowlist is applied first and then the blocklist is applied to the remaining tools. If neither is specified (or this field is omitted), the thread will be allowed to use all tools.
</ResponseField>

<ResponseField name="limits" type="object">
  Resource constraints for the thread. Available fields:

  * `threadOutputTokenLimit` (integer) → Max total output tokens for the thread.
  * `threadFuzzyInputTokenLimit` (integer) → Approximate total (uncached) input token limit.
  * `generationLimit` (integer) → Max LLM generation rounds.
  * `generationOutputTokenLimit` (integer) → Max output tokens per generation.
  * `generationFuzzyInputTokenLimit` (integer) → Approximate (uncached) input token limit per generation.

  Limits constrain model usage (and thus cost) for a thread. Leaving a thread unconstrained is also reasonable,
  especially when the context provided to it keeps it focused on a specific task.
</ResponseField>
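
The `toolFilter` rules above (allowlist first, then blocklist) can be sketched as a small function. This illustrates the described ordering only and is not the server's implementation; `applyToolFilter` is a hypothetical name, and treating an empty `allowedTools` array as "allow nothing" is an assumption.

```javascript
// Compute the tools a thread could use, given all available tool names and a toolFilter.
function applyToolFilter(availableTools, toolFilter = {}) {
  const { allowedTools, disallowedTools } = toolFilter;
  // Step 1: the allowlist, when present, narrows the set of tools.
  let tools = allowedTools
    ? availableTools.filter((name) => allowedTools.includes(name))
    : [...availableTools];
  // Step 2: the blocklist removes tools from whatever remains.
  if (disallowedTools) {
    tools = tools.filter((name) => !disallowedTools.includes(name));
  }
  return tools;
}

const allTools = ["searchDatabase", "sendEmail", "hangUp"];
const allowOnly = applyToolFilter(allTools, { allowedTools: ["searchDatabase"] });
const both = applyToolFilter(allTools, {
  allowedTools: ["searchDatabase", "sendEmail"],
  disallowedTools: ["sendEmail"],
});
const unfiltered = applyToolFilter(allTools);
```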

### ThreadSpawned

Sent by the server to confirm a thread was successfully created.

**Message Structure**

```js theme={null}
{
  "type": "thread_spawned",
  "threadId": "my-thread-1"
}
```

**Fields**

<ResponseField name="threadId" type="string" required>
  The ID of the newly created thread.
</ResponseField>

### ThreadRejected

Sent by the server when a thread spawn request cannot be fulfilled.

**Message Structure**

```js theme={null}
{
  "type": "thread_rejected",
  "threadId": "my-thread-1",
  "reason": "Thread with this ID already exists"
}
```

**Fields**

<ResponseField name="threadId" type="string" required>
  The thread ID from the spawn request.
</ResponseField>

<ResponseField name="reason" type="string" required>
  Why the thread could not be created.
</ResponseField>

### ThreadTerminated

Sent by the server when a thread stops running.

**Message Structure**

```js theme={null}
{
  "type": "thread_terminated",
  "threadId": "my-thread-1",
  "reason": "limit reached"
}
```

**Fields**

<ResponseField name="threadId" type="string" required>
  The ID of the terminated thread.
</ResponseField>

<ResponseField name="reason" type="string" required>
  Why the thread was terminated. Common reasons include reaching a limit or cancellation.
</ResponseField>

### SideGenerationDelta

Streaming text delta from a (non-UI) thread's LLM generation, similar to a transcript delta.

**Message Structure**

```js theme={null}
{
  "type": "side_generation_delta",
  "threadId": "my-thread-1",
  "delta": "Here is the next chunk of text..."
}
```

**Fields**

<ResponseField name="threadId" type="string" required>
  The ID of the thread producing this generation.
</ResponseField>

<ResponseField name="delta" type="string" required>
  The incremental text produced since the last delta message.
</ResponseField>

### SideGenerationCompleted

Sent when a (non-UI) thread's generation round finishes successfully.

**Message Structure**

```js theme={null}
{
  "type": "side_generation_completed",
  "threadId": "my-thread-1",
  "text": "Full generated response text",
  "toolCalls": []
}
```

**Fields**

<ResponseField name="threadId" type="string" required>
  The ID of the thread that completed generation.
</ResponseField>

<ResponseField name="text" type="string" required>
  The complete generated text for this generation round.
</ResponseField>

<ResponseField name="toolCalls" type="array">
  Any tool calls the LLM requested during this generation. The thread will execute these automatically.
</ResponseField>
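
A client that streams thread output can buffer deltas per thread and finalize on completion. The following sketch uses only the message fields documented above; `SideGenerationTracker` is a hypothetical class, and discarding partial text when a thread terminates is a design assumption.

```javascript
// Hypothetical tracker for text streamed by non-UI threads.
class SideGenerationTracker {
  constructor() {
    this.partial = new Map(); // threadId -> text streamed so far in the current round
    this.completed = [];      // finished generation rounds, in arrival order
  }

  handle(msg) {
    switch (msg.type) {
      case "side_generation_delta":
        this.partial.set(msg.threadId, (this.partial.get(msg.threadId) ?? "") + msg.delta);
        break;
      case "side_generation_completed":
        // The completed message carries the full text, so the partial buffer is reset.
        this.partial.delete(msg.threadId);
        this.completed.push({
          threadId: msg.threadId,
          text: msg.text,
          toolCalls: msg.toolCalls ?? [],
        });
        break;
      case "thread_terminated":
        // Assumption: partial text from a terminated thread is discarded.
        this.partial.delete(msg.threadId);
        break;
    }
  }
}

const tracker = new SideGenerationTracker();
tracker.handle({ type: "side_generation_delta", threadId: "my-thread-1", delta: "Hel" });
tracker.handle({ type: "side_generation_delta", threadId: "my-thread-1", delta: "lo" });
const streamedSoFar = tracker.partial.get("my-thread-1");
tracker.handle({ type: "side_generation_completed", threadId: "my-thread-1", text: "Hello", toolCalls: [] });
```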

## Sending Messages to Live Calls via REST API

The [Send Data Message to Call](/api-reference/calls/calls-send-data-message-post) endpoint allows you to inject messages into active calls (calls that are joined and not yet ended).

### Supported Message Types

* [ForcedAgentMessage](#forcedagentmessage)
* [HangUp](#hangup)
* [UserTextMessage](#usertextmessage)

### Responses

Messages sent successfully via the REST API receive a `204 No Content` response with an empty body.

Potential error responses include:

* `401 Unauthorized`: Missing or invalid API key
* `403 Forbidden`: Insufficient authorization
* `422 Unprocessable Entity`: Call is not active (either not joined yet or already ended)
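
A server-side caller might wrap the request and these status codes as sketched below. Several details are assumptions: the endpoint URL and authentication headers are passed in by the caller and should be taken from the API reference, the request body is assumed to be the data message itself, and `sendDataMessage` and `describeSendStatus` are hypothetical names.

```javascript
// Message types this page lists as supported by the REST endpoint.
const REST_SUPPORTED_TYPES = new Set(["forced_agent_message", "hang_up", "user_text_message"]);

// Map the documented status codes to short descriptions.
function describeSendStatus(status) {
  switch (status) {
    case 204: return "delivered";
    case 401: return "missing or invalid API key";
    case 403: return "insufficient authorization";
    case 422: return "call is not active";
    default: return `unexpected status ${status}`;
  }
}

// `endpointUrl` and `headers` come from the caller; `fetchImpl` is injectable for testing.
async function sendDataMessage(endpointUrl, headers, message, fetchImpl = fetch) {
  if (!REST_SUPPORTED_TYPES.has(message.type)) {
    throw new Error(`"${message.type}" cannot be sent via the REST API`);
  }
  const response = await fetchImpl(endpointUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json", ...headers },
    body: JSON.stringify(message), // Assumption: the body is the data message itself.
  });
  return {
    ok: response.status === 204,
    status: response.status,
    detail: describeSendStatus(response.status),
  };
}
```

Checking the message type before sending avoids a network round trip for messages the endpoint does not accept, such as `ping`.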
