Version: 2.0.0

Speech to Text – Whisper Enhanced

Phonexia Speech to Text Whisper Enhanced converts speech in audio signals into plain text.

To offer our partners a broader portfolio of languages for converting speech into text, Phonexia has integrated the open-source transcription technology Whisper.

To improve on the open-source code, we have made several adjustments that increase performance and transcription speed, reduce troublesome behaviour, and make the speech to text more stable.
Still, keep in mind that the quality and precision of the resulting transcript depend on the quality of the audio files.

Supported languages

See the Supported languages page.

Features

Automatic detection of the language spoken in the audio
Language switching: the language is detected for each 30-second block of audio, and the transcription of that block is assigned the detected language
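
The per-block behaviour described under language switching can be pictured with a minimal sketch. It is not the Phonexia implementation: the function names `split_into_blocks` and `label_blocks` are hypothetical, and `detect_language` stands in for whatever language detector is actually used. The only detail taken from the description above is the 30-second block length.

```python
from typing import Callable, List, Sequence, Tuple

BLOCK_SECONDS = 30  # block length taken from the feature description


def split_into_blocks(num_samples: int, sample_rate: int,
                      block_seconds: int = BLOCK_SECONDS) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of consecutive blocks.

    The last block may be shorter than block_seconds.
    """
    block_len = block_seconds * sample_rate
    return [(start, min(start + block_len, num_samples))
            for start in range(0, num_samples, block_len)]


def label_blocks(samples: Sequence[float], sample_rate: int,
                 detect_language: Callable[[Sequence[float]], str]
                 ) -> List[Tuple[float, float, str]]:
    """Attach a language code to every block of audio.

    detect_language is a hypothetical callable (an assumption of this
    sketch) that receives the samples of one block and returns a code.
    """
    labelled = []
    for start, end in split_into_blocks(len(samples), sample_rate):
        language = detect_language(samples[start:end])
        # Report block boundaries in seconds together with the language.
        labelled.append((start / sample_rate, end / sample_rate, language))
    return labelled
```

With a 65-second recording, this sketch would yield three blocks (0 to 30 s, 30 to 60 s, 60 to 65 s), each carrying its own language label, so a speaker who changes language mid-recording gets a different label from the block in which the change is detected.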

Performance

For proper performance, it is important to run Whisper Enhanced on compute-grade graphics cards (GPUs).