Methods
The core of the SDK is the UltravoxSession. The session is used to join and leave calls. The UltravoxSession contains methods for:
- Joining/leaving a call
- Sending text messages to the agent
- Changing the output medium for how the agent replies
- Registering client tools
- Muting the microphone/speaker
joinCall()
Joins a call.
- joinUrl: The joinUrl that was returned from the Create Call request.
- clientVersion: Optional string that can be used for application version tracking. Will be appended to the call and be available in the clientVersion field in the Get Call response.
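A minimal sketch of joining a call, assuming the JavaScript/TypeScript SDK package is ultravox-client and that joinUrl was obtained from your own Create Call request:

```typescript
import { UltravoxSession } from 'ultravox-client';

const session = new UltravoxSession();

// joinUrl comes from the Create Call response (placeholder value shown here).
const joinUrl = 'wss://your-join-url-from-create-call';

// The optional second argument is the clientVersion string used for version tracking.
session.joinCall(joinUrl, 'my-app/1.2.3');
```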
leaveCall()
Leaves the current call. Returns a promise (with no return value) that resolves when the call has successfully been left.
sendText()
Sends a text message to the agent. If deferResponse is set to true, the agent will not respond (i.e. no LLM generation will be done); this can be used to provide additional guidance to the model without triggering a reply.
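For instance, using the session from the joinCall sketch above (and assuming the message is passed first and deferResponse second):

```typescript
// Ask the agent something and let it reply normally.
session.sendText('What sizes does this shirt come in?');

// Pass extra context without triggering a reply (deferResponse skips LLM generation).
session.sendText('The user has navigated to the checkout page.', true);
```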
setOutputMedium()
Sets the agent’s output medium for future utterances. If the agent is currently speaking, this will take effect at the end of the agent’s utterance. The medium controls how replies are communicated and must be either 'text' or 'voice'.
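A quick sketch, passing the medium as a plain string as described above:

```typescript
// Switch the agent to text-only replies (takes effect after any in-progress utterance).
session.setOutputMedium('text');

// Switch back to spoken replies.
session.setOutputMedium('voice');
```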
registerToolImplementation()
Registers a client tool implementation with the given name. If the call is started with a client-implemented tool, this implementation will be invoked when the model calls the tool.
- The name of the tool. Must match what is defined in selectedTools during Create Call. If nameOverride is set then it must match that name; otherwise it must match modelToolName.
- The function that implements the tool’s logic. It accepts parameters (an object containing key-value pairs for the tool’s parameters; the keys will be strings) and returns either a string result, an object with a result string and a responseType, or a Promise that resolves to one of these.
For example:
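(A sketch; the tool name updateOrderStatus and the renderOrderBanner helper are illustrative, not part of the SDK.)

```typescript
// Hypothetical UI helper in your own application code.
function renderOrderBanner(status: string) {
  document.title = `Order: ${status}`;
}

session.registerToolImplementation(
  'updateOrderStatus', // must match the tool name defined in selectedTools
  (parameters: { [key: string]: any }) => {
    const orderStatus = String(parameters['orderStatus']);
    renderOrderBanner(orderStatus);
    // The returned string is sent back to the model as the tool result.
    return `Order status set to ${orderStatus}.`;
  }
);
```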
registerToolImplementations()
Convenience batch wrapper for registerToolImplementation. Takes an object where each key (a string) is the name of a tool and each value is a ClientToolImplementation function.
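A sketch of registering several tools at once (the tool names are illustrative):

```typescript
session.registerToolImplementations({
  updateOrderStatus: (parameters: { [key: string]: any }) => {
    return `Order status set to ${String(parameters['orderStatus'])}.`;
  },
  logFeedback: (parameters: { [key: string]: any }) => {
    console.log('Feedback from agent:', parameters['feedback']);
    return 'Feedback logged.';
  },
});
```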
isMicMuted()
Returns a boolean indicating whether the end user’s microphone is muted. This is scoped to the Ultravox SDK and does not detect muting done by the user outside of your application.

isSpeakerMuted()
Returns a boolean indicating whether the speaker (the agent’s voice output) is muted. This is scoped to the Ultravox SDK and does not detect muting done by the user outside of your application.

muteMic()
Mutes the end user’s microphone. This is scoped to the Ultravox SDK.

unmuteMic()
Unmutes the end user’s microphone. This is scoped to the Ultravox SDK.

muteSpeaker()
Mutes the end user’s speaker (the agent’s voice output). This is scoped to the Ultravox SDK.

unmuteSpeaker()
Unmutes the end user’s speaker (the agent’s voice output). This is scoped to the Ultravox SDK.
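For example, using the session created earlier to wire these into a UI control:

```typescript
// Toggle the end user's microphone from a button handler.
function toggleMic() {
  if (session.isMicMuted()) {
    session.unmuteMic();
  } else {
    session.muteMic();
  }
}

// Silence the agent's voice output, e.g. while showing a text-only view.
session.muteSpeaker();
```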
Client Tools
Ultravox has robust support for tools, and the SDK supports client tools. Client tools are invoked in your client code and enable you to add interactivity in your app that is driven by user interactions with your agent. For example, your agent could choose to invoke a tool that triggers some UI change.

Message Size Limit
Messages larger than 15-16KB may cause timeouts. Keep payload sizes within this limit.
Creating Client Tools
Client tools are defined just like “server” tools, with three exceptions:
1. 'client' not 'http'
You don’t add a URL and HTTP method for client tools. Instead, you add "client": {} to the tool definition (see the example after step 3 below).
2. Register Tool with Client
Your client tool must be registered in your client code. Here’s a snippet that might be found in client code to register the client tool and implement the logic for the tool. See the SDK method registerToolImplementation() for more information.

Registering a Client Tool
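(A sketch; the tool name highlightProduct, its parameter, and the UI update are illustrative.)

```typescript
import { UltravoxSession } from 'ultravox-client';

const session = new UltravoxSession();

// 'highlightProduct' must match the tool name used in selectedTools during Create Call.
session.registerToolImplementation('highlightProduct', (parameters: { [key: string]: any }) => {
  const productName = String(parameters['productName']);

  // Drive a UI change from the agent's tool call.
  document.querySelector('#spotlight')?.setAttribute('data-product', productName);

  // The returned string is sent back to the model as the tool result.
  return `Highlighted ${productName} for the user.`;
});
```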
3. Only Body Parameters Allowed
Unlike server tools (which accept parameters passed by path, header, body, etc.), client tools only allow parameters to be passed in the body of the request. That means client tools will always have their parameter location set like this:
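(A sketch of a selectedTools entry as it might appear in a Create Call request body, using the temporary-tool form; the tool name and parameter name are illustrative.)

```typescript
// Illustrative selectedTools entry for a Create Call request body.
const selectedTools = [
  {
    temporaryTool: {
      modelToolName: 'highlightProduct',
      description: 'Highlights a product in the client UI.',
      dynamicParameters: [
        {
          name: 'productName',
          location: 'PARAMETER_LOCATION_BODY', // client tools only accept body parameters
          schema: { type: 'string', description: 'Name of the product to highlight.' },
          required: true,
        },
      ],
      client: {}, // marks this as a client tool (no URL or HTTP method)
    },
  },
];
```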
Session Status
The UltravoxSession exposes status. Based on the UltravoxSessionStatus enum, the status can be one of the following:
| status | description |
| --- | --- |
| disconnected | Session is not connected. This is the initial state prior to joinCall. |
| disconnecting | Session is in the process of disconnecting. |
| connecting | Session is establishing the connection. |
| idle | Session is connected but not yet active. |
| listening | Listening to the end user. |
| thinking | The model is processing/thinking. |
| speaking | The model is speaking. |
Get Session Status Events
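A sketch of listening for status changes on the session created earlier (this assumes the session emits a 'status' event and exposes the current value as session.status):

```typescript
session.addEventListener('status', () => {
  // session.status holds one of the values from the table above.
  console.log('Session status changed to:', session.status);
});
```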
Transcripts
Sometimes you may want to augment the audio with text transcripts (e.g. if you want to show the end user the model’s output in real time). Transcripts can be retrieved by adding an event listener:

Get Transcripts
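A sketch, assuming a 'transcripts' event and a session.transcripts collection holding the accumulated transcripts:

```typescript
session.addEventListener('transcripts', () => {
  // Re-render your transcript view from the accumulated transcripts.
  console.log('Transcripts updated:', session.transcripts);
});
```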
Debug Messages
No Guarantee
Debug messages from Ultravox should be treated as debug logs. They can change regularly and don’t have a contract. Avoid relying on their specific structure or content.
The UltravoxSession object also provides debug messages. Debug messages must be enabled when creating a new session and are then available via an event listener, similar to status and transcripts:
Get Debug Messages
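A sketch, assuming debug messages are enabled through an experimentalMessages option on the constructor and surfaced via an 'experimental_message' event (both names are assumptions here):

```typescript
import { UltravoxSession } from 'ultravox-client';

// Enable debug messages at session creation time (option name assumed).
const session = new UltravoxSession({ experimentalMessages: new Set(['debug']) });

session.addEventListener('experimental_message', (event) => {
  console.log('Ultravox debug message:', event);
});
```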