Version: 2026.03.0-rc1
Phonexia Enhanced Speech to Text Built on Whisper changelog
[PLAT-1543] Unsupported translation language error when translation and transcription languages differ
[PLAT-1543] Language enforcement in translation mode
[PLAT-1471] Refactoring of the internal audio handling to reduce memory usage. Audio format support is now limited to WAV and FLAC.
[PLAT-1495] Whisper sometimes outputs segments that do not contain word segmentation, even when it's enabled.
[PLAT-1506] Fixed a rare issue where audio resampling could fail when processing large files, particularly when converting between certain sample rates (e.g., 11025 Hz to 8000 Hz).
[PLAT-1509] Fixed GPU processing failure on NVIDIA Blackwell architecture GPUs (e.g., GeForce RTX 50 series) caused by a bug in ONNX Runtime.
Fixed license validation error handling for legacy (v1) license files with missing required fields. Previously, this could cause undefined behavior.
[PLAT-1425] Improved compatibility with future models. WARNING: the microservice now requires model version 1.3.0 or later.
The microservice now logs each request with an incremental correlation ID by default
[VOX-2666] Inconsistent logging
[PLAT-1213] Check whether the model is translation-capable.
[PLAT-1277] Possible memory leak on CUDA device
[PLAT-1056] Support for new licenses tied only to model major version. Old licenses tied to precise model version are still supported.
[VOX-1293] phonexia.grpc.common.Status service
[VOX-1293] Field license_flags to LicensingInfoResult in phonexia.grpc.common.Licensing
[PLAT-1167] Support fine-tuned Whisper models
[PLAT-1103] Ungraceful cancellation when processing is cancelled by the user
[PLAT-980] Word-level segmentation
[PLAT-1070] Default port 8080 exposed in Dockerfile
[PLAT-978] Loading configuration from model
[PLAT-389] Request start and end messages are logged at the INFO level
[PLAT-743] Upgraded CTranslate2 to v4
[PLAT-936] Crash when using negative values in time range
[PLAT-888] Added PHX_KEEPALIVE_TIME_S and PHX_KEEPALIVE_TIMEOUT_S options for connection timeout specification
[PLAT-801] If set, start_time is added to all segment timestamps
[PLAT-838] Option to set the beam size at startup
[PLAT-813] Language tags are now case-insensitive
[PLAT-772] Support for sending RAW audio data
[PLAT-781] Support for license extensions
[PLAT-758] Device is logged as a number rather than a string
[PLAT-644] Support for distilled Whisper v3 models
[PLAT-788] Missing entrypoint in GPU docker image
[PLAT-756] Inconsistent transcription behavior between autodetect and forced language mode
[PLAT-766] Misleading warning message in Python client
[PLAT-765] Python client is not compatible with Python 3.8
[PLAT-774] Optimized docker images
[PLAT-751] Crash when a second request arrives while processing is running
[PLAT-748] Some licenses may be refused by the service
[PLAT-309] Machine translation
[PLAT-669] Support for custom log tags in metadata and logs
[VOX-667] Renamed microservice to enhanced-speech-to-text-built-on-whisper
[PLAT-465] Parameters for selecting the part of the audio to process in TranscribeRequest (audio.time_range)
[PLAT-581] Parameter for enabling language switching during the transcription in TranscribeRequest (config.enable_language_switching)
[PLAT-603] Premature processing cancellation if connection is canceled by client
[PLAT-625] Additional logging when initializing service
[PLAT-614] Inconsistent license logging
[PLAT-609] Memory access violations
[PLAT-521] Support for large-v3 Whisper model
[VOX-396] Limiting of supported languages for Whisper models
[PLAT-520] Correlation ID to log messages
[PLAT-439] Optimization to model loading
[PLAT-296] Phonexia Voice Activity Detection
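
To illustrate the [PLAT-801] entry above, the following minimal sketch shows one way segment timestamps could be shifted by a requested start time so that they refer to positions in the full recording. The `Segment` type and the `offset_segments` function are hypothetical names used only for this illustration; they are not part of the microservice API, and the actual implementation may differ.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Segment:
    """Hypothetical transcription segment; times are in seconds."""
    text: str
    start: float
    end: float


def offset_segments(segments: List[Segment], start_time: float = 0.0) -> List[Segment]:
    """Return copies of the segments with start_time added to both timestamps."""
    return [
        replace(seg, start=seg.start + start_time, end=seg.end + start_time)
        for seg in segments
    ]


if __name__ == "__main__":
    # Timestamps relative to the processed part of the audio (assumed input).
    local = [Segment("hello", 0.5, 1.25), Segment("world", 1.5, 2.0)]
    # Processing is assumed to have started 10 seconds into the recording.
    for seg in offset_segments(local, start_time=10.0):
        print(seg)
```

With this approach, a client that requests only part of a recording receives timestamps that can be used directly against the original file, without adding the offset itself.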