Handling IVR Flows
Build interactive voice response systems with keypad input and DTMF tones.
Ultravox supports both sending and receiving DTMF (Dual-Tone Multi-Frequency) tones during phone calls. This lets AI agents interact with traditional phone systems and lets you build voice applications that respond to keypad input.
Due to the audio codec used in WebRTC connections, DTMF tones are inaudible when using WebRTC. The playDtmfSounds tool is intended for use with telephony integrations.
Receiving DTMF Tones
Ultravox automatically converts incoming DTMF tones to text. When a caller presses keys on their phone keypad, your AI agent receives those key presses as text it can understand and respond to, which makes it straightforward to build interactive voice applications driven by keypad input.
For example, if a caller presses “5” on their keypad, your agent will receive this as text and can respond accordingly:
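One way to prepare an agent for this is to describe the keypad menu in its system prompt. The following is a minimal sketch of that idea in Python. The menu contents, the build_ivr_prompt helper, and the prompt wording are all hypothetical; only the fact that key presses arrive as text comes from the description above.

```python
# Hypothetical keypad menu: maps the text the agent receives to an action.
MENU = {
    "1": "check an order status",
    "2": "update billing details",
    "5": "speak with a support specialist",
}


def build_ivr_prompt(menu: dict[str, str]) -> str:
    """Return a system prompt describing how to react to each key press."""
    lines = [
        "You are a phone assistant. Callers may press keys on their keypad.",
        "Key presses arrive as text. Respond as follows:",
    ]
    for key, action in menu.items():
        lines.append(f"- If the caller presses {key}, help them {action}.")
    lines.append("If the caller presses any other key, read the options again.")
    return "\n".join(lines)


prompt = build_ivr_prompt(MENU)
print(prompt)
```

With a prompt like this, a caller pressing "5" would reach the agent as the text "5", and the agent could route the caller to a support specialist.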
Sending DTMF Tones
The built-in playDtmfSounds tool allows your AI agent to send DTMF tones, which is useful for navigating Interactive Voice Response (IVR) systems or other phone trees. To enable the tool, add it to the selectedTools array when creating a call or call stage:
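A minimal sketch of what such a request body might look like, written as a Python dictionary and serialized to JSON. The {"toolName": ...} entry shape and the systemPrompt field are assumptions about the call-creation payload and should be checked against the API reference. The prompt text is illustrative, and no HTTP request is made here.

```python
import json

# Assumed payload shape for creating a call; verify field names against the API reference.
call_request = {
    "systemPrompt": "You are calling a pharmacy to check on a prescription refill.",
    "selectedTools": [
        # Assumed way to reference a built-in tool by name.
        {"toolName": "playDtmfSounds"},
    ],
}

# Serialize the payload as it would be sent in the request body.
body = json.dumps(call_request, indent=2)
print(body)
```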
The playDtmfSounds tool accepts a string parameter named digits and works with the following tones: 0-9, *, #, A-D.
For example:
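As an illustration, the sketch below checks a digit string against the supported tones before it is used as the digits argument. The is_valid_dtmf helper and the shape of the arguments dictionary are hypothetical; only the parameter name digits and the tone set come from the text above. Whether lowercase letters are accepted is not stated, so this sketch conservatively rejects them.

```python
import re

# Tones listed as supported: 0-9, *, #, A-D.
_DTMF_PATTERN = re.compile(r"[0-9*#A-D]+")


def is_valid_dtmf(digits: str) -> bool:
    """Return True if the string is non-empty and contains only supported tones."""
    return _DTMF_PATTERN.fullmatch(digits) is not None


# Hypothetical arguments an agent might pass to playDtmfSounds.
arguments = {"digits": "1234#"}
print(is_valid_dtmf(arguments["digits"]))  # True
print(is_valid_dtmf("12E4"))               # False
```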
Note: the playDtmfSounds tool also uses an automatic parameter that supplies the sample rate of the source audio. This parameter is populated for you and should be treated as an implementation detail.
Common Use Cases
- Building interactive phone trees or IVR systems
- Creating agents that can navigate existing phone systems
- Enabling quick responses through keypad input
- Collecting numeric input (e.g., account numbers, PIN codes)
- Building hybrid voice/keypad interfaces