Voice Activity Detection
Voice Activity Detection (VAD) is a language-, domain-, and channel-independent technology that distinguishes speech from non-speech content in audio recordings. It labels the speech and non-speech parts of a recording, and these labels can serve as a decision point for whether to process the recording with other technologies. In deployment, VAD is commonly used as a fast filtering step.
Typical use cases include:
- Detecting the presence or absence of human speech for voice processing.
- Filtering out non-speech parts of a recording.
- Excluding recordings that lack sufficient net speech for further processing by other technologies.
- Activating voice-driven processes.
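The filtering use cases above can be sketched in a few lines. This is a minimal illustration, not the product's API: it assumes the VAD output is available as a list of (start, end) speech segments in seconds, and the names `net_speech_seconds` and `has_enough_speech`, as well as the 10-second threshold, are hypothetical choices made for the example.

```python
from typing import List, Tuple

# Assumed output format: (start_s, end_s) of each detected speech region.
Segment = Tuple[float, float]


def net_speech_seconds(segments: List[Segment]) -> float:
    """Total speech duration in seconds, counting overlapping regions once."""
    total = 0.0
    current_end = float("-inf")
    for start, end in sorted(segments):
        if end <= start or end <= current_end:
            continue  # empty/invalid segment, or already fully covered
        # Count only the part that extends past what was already counted.
        total += end - max(start, current_end)
        current_end = end
    return total


def has_enough_speech(segments: List[Segment],
                      min_net_speech_s: float = 10.0) -> bool:
    """Decide whether a recording should be passed to further processing.

    The 10-second default is an illustrative threshold, not a documented value.
    """
    return net_speech_seconds(segments) >= min_net_speech_s


if __name__ == "__main__":
    segments = [(0.0, 4.0), (3.0, 6.0), (10.0, 12.0)]
    print(net_speech_seconds(segments))
    print(has_enough_speech(segments))
```

A recording that fails the check can be skipped before it reaches slower downstream technologies, which is the point of using VAD as a pre-filter.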
The processing speed of Voice Activity Detection is 140 FtRT (140 times faster than real time) per instance.
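As a rough capacity estimate, and assuming the figure denotes a faster-than-real-time factor, one instance handles about 140 seconds of audio per second of wall-clock time. The sketch below turns that into an expected processing time. The function name `estimated_processing_seconds` is hypothetical, linear scaling across instances is an assumption, and I/O and start-up overhead are ignored.

```python
def estimated_processing_seconds(audio_seconds: float,
                                 speed_ftrt: float = 140.0,
                                 instances: int = 1) -> float:
    """Estimate wall-clock time to run VAD over `audio_seconds` of audio.

    Assumes throughput scales linearly with the number of instances,
    which is an idealization rather than a documented guarantee.
    """
    if audio_seconds < 0 or speed_ftrt <= 0 or instances < 1:
        raise ValueError("audio_seconds >= 0, speed_ftrt > 0, instances >= 1 required")
    return audio_seconds / (speed_ftrt * instances)


if __name__ == "__main__":
    # One hour of audio on a single instance.
    print(round(estimated_processing_seconds(3600), 1))  # → 25.7
```

Under these assumptions, an hour-long recording takes under half a minute on a single instance.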