Handling Background Speakers

Ultravox Realtime provides built-in support for handling background speakers and multi-speaker environments. The system is designed to focus on the primary speaker while filtering out cross-talk and unwanted voice interactions.

Automatic Filtering

Background speaker filtering is enabled by default for all Ultravox calls. This helps your AI agent focus on the intended speaker even in challenging multi-speaker scenarios.

Addressing a Complex Challenge

Multi-speaker environments present unique difficulties for voice AI systems:

Speaker phone scenarios where multiple voices may be muffled or distant
Cross-talk situations with overlapping conversations
Background conversations that shouldn’t trigger the AI agent

Advanced Speaker Detection

Ultravox employs several techniques to handle these challenging scenarios:

Model Training → The Ultravox model distinguishes between speech and noise or unintelligible speech.
Speaker Tracking → Advanced algorithms analyze vocal power levels and patterns to identify the primary speaker and filter out background voices.
Real-time Processing → All speaker detection and filtering happen in real time without adding latency to the conversation.

The result is cleaner voice interactions where your AI agent responds to the intended speaker, reducing confusion and improving conversation quality in complex acoustic environments.
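To make the idea of tracking vocal power levels more concrete, here is a minimal, purely illustrative sketch in Python. It is not Ultravox's implementation, which is not described in detail here. The function names (`rms_db`, `label_frames`) and the thresholds (`margin_db`, `floor_db`, `decay_db`) are hypothetical choices made for this example. The sketch assumes audio arrives as frames of float samples in the range [-1.0, 1.0] and treats the loudest recent voice as the primary speaker.

```python
import math
from typing import List, Optional, Sequence


def rms_db(frame: Sequence[float]) -> float:
    """Return the RMS power of a frame in dB relative to full scale."""
    if not frame:
        return -120.0
    mean_square = sum(s * s for s in frame) / len(frame)
    # Clamp to avoid log10(0) on digital silence.
    return 10.0 * math.log10(max(mean_square, 1e-12))


def label_frames(
    frames: Sequence[Sequence[float]],
    margin_db: float = 10.0,   # hypothetical: how far below the primary level still counts as primary
    floor_db: float = -50.0,   # hypothetical: anything quieter is treated as silence
    decay_db: float = 0.5,     # hypothetical: per-frame decay of the primary level estimate
) -> List[str]:
    """Label each frame as 'primary', 'background', or 'silence'."""
    labels: List[str] = []
    primary_db: Optional[float] = None  # running estimate of the primary speaker's level

    for frame in frames:
        level = rms_db(frame)
        if level < floor_db:
            labels.append("silence")
            continue

        # Follow louder frames immediately; otherwise let the estimate decay slowly
        # so the tracker can adapt if the primary speaker gets quieter.
        if primary_db is None:
            primary_db = level
        else:
            primary_db = max(level, primary_db - decay_db)

        if level >= primary_db - margin_db:
            labels.append("primary")
        else:
            labels.append("background")

    return labels


if __name__ == "__main__":
    loud = [0.5] * 160     # close, clear speaker (synthetic constant-amplitude frame)
    quiet = [0.05] * 160   # distant voice, about 20 dB lower
    silent = [0.0] * 160
    print(label_frames([loud, quiet, silent, loud]))
```

A production system would combine a signal like this with a trained model and with patterns over time rather than relying on a single energy threshold; the sketch only shows how relative power can separate a nearby speaker from quieter background voices.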

