Speech Platform Release Notes

Version 3.1.0

July 30, 2024 · One min read

Technology licenses are now significantly easier to install or update: they are loaded from a separate YAML file. Unzip the licensed models package to the data disk and the system loads the licenses automatically.

⚠️ NOTE: The new license format is not compatible with previous versions. Licenses used in earlier Virtual Appliance versions cannot be used in this version, and a new license must be obtained from Phonexia.
Added Language Identification technology preview in GUI.
Added Speaker Diarization technology ([high-level description](/products/speech-platform-4/technologies/speaker-diarization)) preview in GUI.
Voiceprint Extraction can now run on GPU, which significantly boosts extraction performance.
Virtual Appliance configuration changes are now applied automatically to the corresponding components (restarts are no longer needed).
Lower memory consumption during processing thanks to optimizations of internal components.

Included Components

May 15, 2024 · One min read

The Speech to Text Whisper Enhanced technology has been renamed to Enhanced Speech to Text Built on Whisper, and a new Language switching feature has been added. This feature identifies the predominant language spoken within each thirty-second interval of audio and uses that language to transcribe that interval.
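To make the per-interval behaviour of Language switching concrete, here is a minimal Python sketch of the idea, not the product's implementation. It splits a recording into thirty-second windows and pairs each window with whatever language a detector reports for it. The names `split_into_windows`, `plan_transcription`, and the `detect_language` callback are hypothetical and exist only for illustration; how the platform actually detects the language is not described in these notes.

```python
from typing import Callable, List, Tuple

# Interval length taken from the release note; everything else is illustrative.
WINDOW_SECONDS = 30.0


def split_into_windows(
    duration: float, window: float = WINDOW_SECONDS
) -> List[Tuple[float, float]]:
    """Return (start, end) pairs that cover [0, duration) in fixed-size windows.

    The last window is shorter when the duration is not a multiple of `window`.
    """
    windows = []
    start = 0.0
    while start < duration:
        end = min(start + window, duration)
        windows.append((start, end))
        start = end
    return windows


def plan_transcription(
    duration: float,
    detect_language: Callable[[float, float], str],
) -> List[Tuple[float, float, str]]:
    """Pair every window with the language reported for it.

    `detect_language` is a hypothetical callback supplied by the caller; it
    receives the window boundaries in seconds and returns a language code.
    """
    return [
        (start, end, detect_language(start, end))
        for start, end in split_into_windows(duration)
    ]
```

For example, calling `plan_transcription(75.0, lambda start, end: "en" if start < 60 else "cs")` produces three windows, the first two labelled `"en"` and the last, shorter one labelled `"cs"`.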
For the Speech to Text Phonexia and Time Analysis of Speech technologies, it is now possible to configure the number of tasks processed in parallel. This is done using the parallelism parameter in the corresponding sections of the Virtual Appliance configuration file.
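As a generic illustration of what a cap on parallel tasks means, the following Python sketch runs a batch of tasks with a bounded worker pool. It is an assumption-laden analogy, not the Virtual Appliance's code or configuration syntax: `run_tasks`, `worker`, and the `parallelism` argument are hypothetical names chosen to mirror the setting described above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_tasks(
    tasks: Iterable[T],
    worker: Callable[[T], R],
    parallelism: int,
) -> List[R]:
    """Process `tasks` with at most `parallelism` of them running at once.

    Results are returned in the same order as the input tasks.
    """
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    # max_workers bounds how many tasks execute concurrently;
    # the remaining tasks wait in the pool's queue.
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        return list(pool.map(worker, tasks))
```

A higher value lets more tasks run at the same time at the cost of more memory and compute in use simultaneously, so the right setting depends on the resources available to the deployment.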

Included Components

April 21, 2024 · One min read

Included Components

April 2, 2024 · One min read

Added Time Analysis of Speech technology (high-level description), available via REST API only (no GUI).
Configuration and administration changes:
- Added options to change tmpdir volume for speech-platform API and media-conversion.
- Added options to configure UI limits.
- Added option to change API log level.
- Models are now stored on data disk separately for each microservice.

Included Components

February 19, 2024 · One min read

Initial release with Speaker Identification ([high-level description](/products/speech-platform-4/technologies/speaker-identification)) and Speech to Text (high-level description) technologies available via REST API and in GUI. Speech to Text Whisper Enhanced supports automatic detection of the language.

Included Components