Ultravox SDK

The Ultravox REST API is used to create calls, but you must use one of the Ultravox client SDKs to join and end calls. This page primarily uses JavaScript examples; the concepts are the same across all of the SDK implementations.
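
For example, a common pattern is to create the call from your own server (which holds your Ultravox API key), return the joinUrl to the client, and then join with the SDK. A minimal sketch in JavaScript, assuming a hypothetical /ultravox/create-call endpoint on your backend that proxies the REST API and responds with { joinUrl }:

import { UltravoxSession } from 'ultravox-client';

// Ask your backend to create the call via the Ultravox REST API.
// '/ultravox/create-call' is an illustrative endpoint in your own app.
const response = await fetch('/ultravox/create-call', { method: 'POST' });
const { joinUrl } = await response.json();

// Join (and later leave) the call from the client with the SDK.
const session = new UltravoxSession();
const state = await session.joinCall(joinUrl);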

Ultravox Session

The core of the SDK is the UltravoxSession. The session is used to join and leave calls.

import { UltravoxSession } from 'ultravox-client';
const session = new UltravoxSession();
const state = await session.joinCall('wss://your-call-join-url');
session.leaveCall();

Methods

The UltravoxSession contains methods for joining and leaving a call, sending text messages to the model, and muting the microphone or speaker.

joinCall()

joinCall(joinUrl: string): UltravoxSessionState

Joins a call. Requires a joinUrl (string). Returns an UltravoxSessionState.

leaveCall()

async leaveCall(): Promise<void>

Leaves the current call. Returns a promise (with no return value) that resolves when the call has successfully been left.
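
Since leaveCall() is asynchronous, you can await it when tearing down your UI. A small sketch:

// Wait for the call to be fully left before cleaning up.
async function endCall(session) {
  await session.leaveCall();
  console.log('Left the call');
}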

sendText()

sendText(text: string): void

Sends a text message to the model. Requires the message text (string). The model replies via the text transcripts (see Transcripts below).
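
For example, you might wire a text input to sendText() so the end user can type to the agent; the reply arrives through the transcripts described below. A minimal sketch (the element IDs are illustrative):

// Send typed input to the model; replies show up in the transcripts.
const input = document.getElementById('chat-input');
document.getElementById('send-button').addEventListener('click', () => {
  session.sendText(input.value);
  input.value = '';
});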

isMicMuted()

isMicMuted(): boolean

Returns a boolean indicating whether the end user’s microphone is muted. This is scoped to the Ultravox SDK and does not detect muting done by the user outside of your application.

isSpeakerMuted()

isSpeakerMuted(): boolean

Returns a boolean indicating whether the speaker (the agent’s voice output) is muted. This is scoped to the Ultravox SDK and does not detect muting done by the user outside of your application.

muteMic()

muteMic(): void

Mutes the end user’s microphone. This is scoped to the Ultravox SDK.

unmuteMic()

unmuteMic(): void

Unmutes the end user’s microphone. This is scoped to the Ultravox SDK.

muteSpeaker()

muteSpeaker(): void

Mutes the end user’s speaker (the agent’s voice output). This is scoped to the Ultravox SDK.

unmuteSpeaker()

unmuteSpeaker(): void

Unmutes the end user’s speaker (the agent’s voice output). This is scoped to the Ultravox SDK.
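
Putting the mute methods together, a simple mic/speaker toggle might look like this sketch, assuming session is an active UltravoxSession:

// Toggle the end user's microphone using the SDK-scoped mute state.
function toggleMic(session) {
  if (session.isMicMuted()) {
    session.unmuteMic();
  } else {
    session.muteMic();
  }
}

// Toggle the agent's voice output the same way.
function toggleSpeaker(session) {
  if (session.isSpeakerMuted()) {
    session.unmuteSpeaker();
  } else {
    session.muteSpeaker();
  }
}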

Ultravox Session State

When a call is joined, an UltravoxSessionState is returned. This object provides the current status, can be used to get text transcripts of the call, and surfaces debug messages that are helpful when building your application.

Status

The session status is based on the UltravoxSessionStatus enum and can be one of the following:

disconnected: Session is not connected. This is the initial state prior to joinCall.
disconnecting: Session is in the process of disconnecting.
connecting: Session is establishing the connection.
idle: Session is connected but not yet active.
listening: Listening to the end user.
thinking: The model is processing/thinking.
speaking: The model is speaking.

Status Events

The status can be retrieved by adding an event listener to the state. Building on what we did above:

// Listen for status changing events
state.addEventListener('ultravoxSessionStatusChanged', (event) => {
  console.log('Session status changed: ', event.state);
});
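
You can branch on the reported status to drive your UI. A sketch building on the listener above, assuming the enum values serialize to the lowercase names shown in the table:

state.addEventListener('ultravoxSessionStatusChanged', (event) => {
  switch (event.state) {
    case 'listening':
      console.log('Agent is listening - show a live mic indicator');
      break;
    case 'thinking':
      console.log('Agent is thinking - show a spinner');
      break;
    case 'speaking':
      console.log('Agent is speaking - show a speaking animation');
      break;
    default:
      console.log('Session status: ', event.state);
  }
});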

Transcripts

Sometimes you may want to augment the audio with text transcripts (e.g. if you want to show the end user the model’s output in real-time). Transcripts can be retrieved by adding an event listener to state:

import { UltravoxSession } from 'ultravox-client';
const session = new UltravoxSession();
const state = await session.joinCall('wss://your-call-join-url');
// Listen for transcripts changing events
state.addEventListener('ultravoxTranscriptsChanged', (event) => {
  console.log('Transcripts updated: ', event.transcripts);
  console.log('Current session status: ', event.state); // Session status is also available on the event
});
session.leaveCall();

Transcripts are an array of transcript objects. Each transcript has the following properties:

text (string): Text transcript of the speech from the end user or the agent.
isFinal (boolean): True if the transcript represents a complete utterance. False if it is a fragment of an utterance that is still underway.
speaker (Role): Either “user” or “agent”. Denotes who was speaking.
medium (Medium): Either “voice” or “text”. Denotes how the message was sent.
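
For example, to display only completed agent utterances you can filter on isFinal and speaker. A sketch building on the transcript listener above, assuming speaker is the lowercase string shown in the table:

state.addEventListener('ultravoxTranscriptsChanged', (event) => {
  // Keep only finished agent utterances for display.
  const agentLines = event.transcripts
    .filter((t) => t.isFinal && t.speaker === 'agent')
    .map((t) => t.text);
  console.log('Agent said:\n' + agentLines.join('\n'));
});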

Debug Messages

The state object also provides debug messages. Debug messages must be enabled when creating the UltravoxSession and are then available via an event listener, similar to status and transcripts:

import { UltravoxSession } from 'ultravox-client';
const debugMessages = new Set(["debug"]);
const session = new UltravoxSession({ experimentalMessages: debugMessages });
const state = await session.joinCall('wss://your-call-join-url');
// Listen for debug messages
state.addEventListener('ultravoxExperimentalMessage', (msg) => {
  console.log('Got a debug message: ', JSON.stringify(msg));
});
session.leaveCall();

Debug Message: Tool Call

When the agent invokes a tool, the message contains the function, all arguments, and an invocation ID:

LLM response: Tool calls: [FunctionCall(name='createProfile', args='{"firstName":"Ron","lastName":"Burgandy","organization":"Fixie.ai","useCase":"creating a talking AI news reporter"}', invocation_id='call_D2qQVS8OQc998aMEw5PRa9cF')]

Debug Message: Tool Call Result

When the tool call completes, the message contains an array of messages. Multiple tools can be invoked by the model. This message array will contain all of the calls followed by all of the results. These messages are also available via List Call Messages.

Here’s an example of what we might see from a single tool invocation:

Tool call complete.
Result: [
role: MESSAGE_ROLE_TOOL_CALL ordinal: 6 text: "{\"firstName\":\"Ron\",\"lastName\":\"Burgandy\",\"organization\":\"Fixie.ai\",\"useCase\":\"creating a talking AI news reporter\"}" tool_name: "createProfile" invocation_id: "call_D2qQVS8OQc998aMEw5PRa9cF" tool_id: "aa737e12-0989-4adb-9895-f387f40557d8" ,
role: MESSAGE_ROLE_TOOL_RESULT ordinal: 7 text: "{\"firstName\":\"Ron\",\"lastName\":\"Burgandy\",\"emailAddress\":null,\"organization\":\"Fixie\",\"useCase\":\"creating a talking AI news reporter\"}" tool_name: "createProfile" invocation_id: "call_D2qQVS8OQc998aMEw5PRa9cF" tool_id: "aa737e12-0989-4adb-9895-f387f40557d8"
]
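
If you want these messages after the call, you can also fetch them server-side from the REST API. A hedged sketch, assuming the List Call Messages endpoint is GET https://api.ultravox.ai/api/calls/{callId}/messages authenticated with an X-API-Key header (check the REST API reference for the exact path):

// Server-side sketch: list a call's stored messages, including tool calls and results.
// The endpoint path and header below are assumptions; verify against the REST API docs.
async function listCallMessages(callId, apiKey) {
  const response = await fetch(
    `https://api.ultravox.ai/api/calls/${callId}/messages`,
    { headers: { 'X-API-Key': apiKey } }
  );
  return response.json();
}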

SDK Implementations

There are currently four implementations of the SDK available:

Flutter

flutter pub add ultravox_client
Get it on pub.dev

JavaScript

npm install ultravox-client
Available in the npm registry

Python

pip install ultravox-client
More info on PyPI