Creating a Custom Voice
Currently, we support one cloned voice per account. If you need more cloned voices, please reach out.
Prerequisites
- An Ultravox Realtime API key
- A single audio file containing a clear voice sample (30 seconds recommended)
- The audio file must be in .mp3 or .wav format
Using the API
To create a custom voice, send a POST request to the /api/voices endpoint with your audio file. Note: multiple files are not supported.
Here’s how to do it:
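As a minimal sketch, the upload can be built with nothing but the Python standard library. The base URL (`https://api.ultravox.ai`), the `X-API-Key` header name, and the `file` form field name are assumptions not confirmed by this page; check them against the API reference before relying on them. The helper names (`build_multipart`, `build_voice_request`, `upload_voice`) are hypothetical.

```python
import mimetypes
import urllib.request
import uuid
from typing import Tuple

# Assumed values: base URL, auth header name, and form field name are not
# stated on this page and may differ from the real API.
API_URL = "https://api.ultravox.ai/api/voices"
AUTH_HEADER = "X-API-Key"
FILE_FIELD = "file"


def build_multipart(field_name: str, filename: str, data: bytes) -> Tuple[bytes, str]:
    """Encode a single file as multipart/form-data; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def build_voice_request(api_key: str, filename: str, data: bytes) -> urllib.request.Request:
    """Prepare (but do not send) the POST request carrying exactly one audio file."""
    body, content_type = build_multipart(FILE_FIELD, filename, data)
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={AUTH_HEADER: api_key, "Content-Type": content_type},
    )


def upload_voice(api_key: str, path: str) -> str:
    """Send the request and return the raw response body as text."""
    with open(path, "rb") as f:
        req = build_voice_request(api_key, path.split("/")[-1], f.read())
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")
```

Calling `upload_voice("YOUR_API_KEY", "sample.wav")` would send the file. Only one file is attached per request, matching the note above that multiple files are not supported.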
Requirements for Audio Samples
For optimal results, ensure your audio sample meets these criteria:
- Clear, high-quality audio without background noise or echo
- Single speaker throughout the recording
- Natural speaking pace and tone
- No music or other voices in the background
- 30-60 seconds in length (longer samples do not typically lead to better clones)
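The length criterion is the one item in this list that is easy to check automatically. The sketch below reads the duration of a .wav sample with Python's standard `wave` module and compares it with the 30-60 second range. The function names are hypothetical, and the check covers uncompressed WAV only; .mp3 files would need a third-party decoder.

```python
import wave

# Recommended range taken from the list above.
MIN_SECONDS = 30.0
MAX_SECONDS = 60.0


def wav_duration_seconds(source) -> float:
    """Return the duration of a WAV file given a path or a binary file object."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_recommended_length(seconds: float) -> bool:
    """True when the sample falls inside the recommended 30-60 second range."""
    return MIN_SECONDS <= seconds <= MAX_SECONDS
```

For example, `is_recommended_length(wav_duration_seconds("sample.wav"))` reports whether a local recording is within range. The other criteria (noise, single speaker, natural pace) still need a listening check.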
Limitations
- Maximum of one audio file per voice
- 10MB file size maximum
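These limits, together with the format prerequisite, can be checked locally before uploading. The following is a minimal sketch with hypothetical names. It assumes "10MB" means 10,000,000 bytes, the stricter of the two common readings, since the text does not say which is meant.

```python
import os
from typing import List

ALLOWED_EXTENSIONS = {".mp3", ".wav"}
# Assumption: 10 MB read as 10,000,000 bytes (stricter than 10 MiB).
MAX_BYTES = 10 * 1000 * 1000


def check_sample(filename: str, size_bytes: int) -> List[str]:
    """Return a list of human-readable problems; an empty list means the file passes."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format {ext or '(none)'}; use .mp3 or .wav")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes; maximum is {MAX_BYTES}")
    return problems
```

A caller could run `check_sample(path, os.path.getsize(path))` and skip the upload if the returned list is non-empty.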