Speech to Text – Whisper Enhanced
Phonexia Speech to Text Whisper Enhanced converts speech in audio signals into plain text.
To offer our partners a broader portfolio of languages for converting speech into text, Phonexia has integrated the open-source transcription technology Whisper.
We have made several adjustments to the open-source code that improve
performance and transcription speed, reduce problematic behaviour,
and make the speech to text more stable.
Still, keep in mind that the quality and precision of the resulting
transcript depend on the quality of the audio files.
Supported languages
Features
- Autodetection of the language spoken in the audio
- Language switching: the language is detected for each 30 s block of audio, and the transcription of that block is assigned the corresponding language
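The language switching feature can be pictured with a minimal sketch in Python. This is an illustration under stated assumptions, not the Phonexia implementation: the helper names `split_into_blocks` and `transcribe_with_language_switching`, and the `detect_language` and `transcribe` callables, are hypothetical placeholders. Only the 30 s block length comes from the feature description above.

```python
from typing import Callable, Dict, List, Sequence, Tuple

BLOCK_SECONDS = 30  # block length taken from the feature description


def split_into_blocks(num_samples: int, sample_rate: int,
                      block_seconds: int = BLOCK_SECONDS) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of consecutive blocks.

    The last block may be shorter than block_seconds.
    """
    block_len = block_seconds * sample_rate
    return [(start, min(start + block_len, num_samples))
            for start in range(0, num_samples, block_len)]


def transcribe_with_language_switching(
    samples: Sequence[float],
    sample_rate: int,
    detect_language: Callable[[Sequence[float]], str],   # hypothetical detector
    transcribe: Callable[[Sequence[float], str], str],   # hypothetical transcriber
) -> List[Dict[str, object]]:
    """Detect the language of each block, then transcribe the block in it."""
    results: List[Dict[str, object]] = []
    for start, end in split_into_blocks(len(samples), sample_rate):
        block = samples[start:end]
        language = detect_language(block)
        results.append({
            "start_s": start / sample_rate,
            "end_s": end / sample_rate,
            "language": language,
            "text": transcribe(block, language),
        })
    return results
```

Each returned entry carries its own language label, so a recording that switches languages between blocks yields differently labelled segments. A switch that happens inside a block is not captured by this sketch.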
Performance
For proper performance, it is important to run Whisper Enhanced on compute-grade graphics cards (GPUs).