Denoiser
The denoiser technology is designed to suppress noise in audio recordings. By leveraging a combination of traditional signal processing and deep learning, it provides a real-time noise suppression system that is efficient, lightweight, and capable of running on standard hardware without the need for specialized GPUs or other hardware accelerators.
Noise Suppression
The primary goal of noise suppression is to remove unwanted noise from an audio signal while distorting the original signal as little as possible.
Conventional noise suppression algorithms typically operate in the spectral domain. They estimate the noise spectrum using handcrafted filters and subtract it from the noisy audio spectrum. While effective, these methods often require manual tuning of noise estimation parameters, limiting their adaptability to various noise conditions.
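As an illustration of the conventional approach, the following is a minimal sketch of power spectral subtraction on a single frame. It is not taken from the denoiser itself; the function names and the `over_subtraction` and `floor` parameters are assumptions chosen for this example, and they stand in for the kind of hand-tuned settings mentioned above.

```python
def estimate_noise_power(noise_frames):
    """Average the per-bin power over frames assumed to contain only noise."""
    n_bins = len(noise_frames[0])
    return [
        sum(frame[k] for frame in noise_frames) / len(noise_frames)
        for k in range(n_bins)
    ]


def spectral_subtraction(noisy_power, noise_power, over_subtraction=1.0, floor=0.01):
    """Subtract a noise estimate from one frame's power spectrum.

    over_subtraction scales the noise estimate before subtracting it.
    floor keeps a small fraction of the noisy power in each bin so the
    result never becomes negative. Both are hand-tuned (hypothetical) values.
    """
    cleaned = []
    for p_noisy, p_noise in zip(noisy_power, noise_power):
        p_clean = p_noisy - over_subtraction * p_noise
        cleaned.append(max(p_clean, floor * p_noisy))
    return cleaned


# Two noise-only frames with two frequency bins each.
noise_estimate = estimate_noise_power([[1.0, 2.0], [3.0, 2.0]])
cleaned_frame = spectral_subtraction([10.0, 1.0], noise_estimate)
```

In the second bin the noise estimate exceeds the observed power, so the floor takes over. Choosing good values for such parameters across different noise conditions is the tuning burden that the conventional approach carries.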
Deep neural networks offer a more powerful approach. These models can be trained to directly process the raw audio signal, learning to identify and remove noise patterns without explicit spectral analysis. This end-to-end approach has the potential for superior performance but often requires extensive training data and computational resources. Additionally, many deep learning models are not suitable for real-time applications due to their computational complexity.
Our denoiser uses a hybrid approach, combining elements of both methods. It utilizes a compact deep learning model to estimate noise characteristics in the spectral domain, bypassing the need for manual tuning. This hybrid approach allows for robust noise reduction across a wide range of noise types and levels while maintaining low computational cost.
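One way to picture a hybrid design of this kind is sketched below: a learned model supplies the noise estimate, and ordinary signal processing applies it as a per-bin gain. This is only an assumption about the general shape of such a system, not the denoiser's actual architecture. The `noise_model` callable is a hypothetical stand-in for the trained network, and the Wiener-style gain rule and `min_gain` value are choices made for this example.

```python
def wiener_gains(noisy_power, noise_power, min_gain=0.1):
    """Compute a Wiener-style gain per frequency bin, clamped to [min_gain, 1]."""
    gains = []
    for p_noisy, p_noise in zip(noisy_power, noise_power):
        if p_noisy <= 0.0:
            # Nothing to scale in an empty bin; fall back to the minimum gain.
            gains.append(min_gain)
            continue
        gain = 1.0 - p_noise / p_noisy
        gains.append(min(1.0, max(min_gain, gain)))
    return gains


def denoise_frame(noisy_spectrum, noise_model, min_gain=0.1):
    """Attenuate one frame of complex spectral bins.

    noise_model is a hypothetical callable that maps a power spectrum to an
    estimated noise power spectrum of the same length. In a hybrid system,
    this is the part a compact neural network would provide.
    """
    noisy_power = [abs(x) ** 2 for x in noisy_spectrum]
    noise_power = noise_model(noisy_power)
    gains = wiener_gains(noisy_power, noise_power, min_gain)
    return [g * x for g, x in zip(gains, noisy_spectrum)]


def toy_model(power):
    """Toy stand-in for the model: claims half of every bin's power is noise."""
    return [0.5 * p for p in power]


denoised = denoise_frame([2 + 0j, 0j], toy_model)
```

Because the gains only scale existing spectral bins, the per-frame cost of this step is small, which is consistent with the low computational cost described above.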
Use Cases and Applications
The denoiser is particularly effective at enhancing listening comfort. However, it does not necessarily improve speech intelligibility, as the human brain is already skilled at distinguishing speech from background noise. In some cases, denoising may even reduce intelligibility if it removes useful audio information.
Similarly, preprocessing noisy inputs for other speech technologies, such as speech recognition or speaker identification, may not yield improved performance, as these systems are often trained to handle noisy data. Removing noise from these inputs can sometimes degrade performance by discarding information that the models could otherwise use.
Despite these limitations, the denoiser remains valuable in scenarios such as:
- Videoconferencing: In settings where multiple speakers' inputs are mixed together, noise suppression prevents the accumulation of background noises from inactive speakers, thereby improving both audio quality and intelligibility for the active speaker.
- Audio conversion: When lowering the bitrate of an audio file or compressing it with certain codecs, denoising the audio beforehand can result in a clearer, higher-quality output, as compression methods generally degrade noisy audio more severely than clean audio.
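The accumulation effect in the videoconferencing case can be estimated with simple arithmetic. The sketch below assumes that every participant contributes uncorrelated background noise of equal power and that the streams are summed without normalization. These are simplifying assumptions for illustration, and `mixed_noise_increase_db` is a hypothetical helper name.

```python
import math


def mixed_noise_increase_db(n_sources):
    """Noise level increase, in dB, when n equal-power uncorrelated noise
    sources are summed, relative to a single source."""
    if n_sources < 1:
        raise ValueError("n_sources must be at least 1")
    # Powers of uncorrelated signals add, so total power is n times one source.
    return 10.0 * math.log10(n_sources)


increase_for_ten = mixed_noise_increase_db(10)
```

Under these assumptions, mixing ten open microphones raises the noise level by 10 dB compared with a single one, while the active speaker's level stays the same.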
Examples
The following examples demonstrate how the denoiser can suppress various types of background noise in audio recordings. All noisy recordings were created by adding noise to the following clean recording, which shows the baseline audio quality:
Original Audio
Results
Noise Type | Noisy Audio | Denoised Audio |
---|---|---|
Train | ||
Sea waves | ||
Thunderstorm | ||
Insects | ||
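Noisy test material of this kind is commonly produced by scaling a noise recording to a chosen signal-to-noise ratio (SNR) and adding it to the clean signal. The sketch below shows that procedure. The mixing levels used for the samples above are not stated, so `snr_db` and the `mix_at_snr` helper are assumptions made for illustration.

```python
import math


def mix_at_snr(clean, noise, snr_db):
    """Add noise to a clean signal so that the result has the requested SNR.

    clean and noise are sequences of samples; noise must be at least as long
    as clean and is truncated to match.
    """
    if not clean:
        raise ValueError("clean signal is empty")
    if len(noise) < len(clean):
        raise ValueError("noise must be at least as long as the clean signal")
    noise = noise[: len(clean)]

    clean_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum(x * x for x in noise) / len(noise)
    if noise_power == 0.0:
        raise ValueError("noise signal is silent")

    # Noise power needed to reach the requested SNR, then the amplitude scale.
    target_noise_power = clean_power / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(target_noise_power / noise_power)
    return [c + scale * n for c, n in zip(clean, noise)]


clean = [1.0, -1.0, 1.0, -1.0]
noise = [1.0, 1.0, -1.0, -1.0]
noisy_0db = mix_at_snr(clean, noise, snr_db=0.0)
noisy_20db = mix_at_snr(clean, noise, snr_db=20.0)
```

At 0 dB the noise is as strong as the clean signal, while at 20 dB its amplitude is scaled to one tenth, which makes it easy to prepare the same recording under several noise levels.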