Phonexia Enhanced Speech to Text Built on Whisper changelog
1.6.0 (2024-09-12)
Added
- [PLAT-888] Added PHX_KEEPALIVE_TIME_S and PHX_KEEPALIVE_TIMEOUT_S options for connection timeout specification
1.5.0 (2024-08-05)
Changed
- [PLAT-801] If set, start_time is added to all segment timestamps
- [PLAT-838] Possibility to set beam size on start
- [PLAT-813] Language tags are now case-insensitive
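
The effect of the [PLAT-801] change can be illustrated with a minimal sketch. The Segment class and the shift_segments helper below are hypothetical stand-ins, not the service's actual types, and the sketch assumes that start_time and the segment timestamps are plain numbers of seconds; the real API may represent them differently.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Segment:
    """Hypothetical stand-in for a transcription segment (not the service's type)."""
    text: str
    start: float  # seconds, relative to the beginning of the processed audio
    end: float    # seconds, relative to the beginning of the processed audio


def shift_segments(segments: List[Segment], start_time: float) -> List[Segment]:
    """Return copies of the segments with start_time added to both timestamps."""
    return [
        replace(s, start=s.start + start_time, end=s.end + start_time)
        for s in segments
    ]


# Segments as they might look without an offset (illustrative values only).
segments = [Segment("hello", 0.0, 1.5), Segment("world", 1.5, 3.0)]

# With start_time set to 10 seconds, every timestamp moves by the same amount.
shifted = shift_segments(segments, start_time=10.0)
```

Under these assumptions, segment durations are unchanged; only their position on the timeline moves by start_time.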
1.4.0 (2024-07-11)
Added
- [PLAT-772] Support for sending RAW audio data
- [PLAT-781] Support for license extensions
Fixed
- [PLAT-758] Device is logged as a number rather than a string
1.3.0 (2024-06-25)
Added
- [PLAT-644] Support for distilled whisper v3 models
1.2.5 (2024-06-06)
Fixed
- [PLAT-788] Missing entrypoint in GPU docker image
1.2.4 (2024-05-30)
Fixed
- [PLAT-756] Inconsistent transcription behavior between autodetect and forced language mode
- [PLAT-766] Misleading warning message in Python client
- [PLAT-765] Python client is not compatible with Python 3.8
- [PLAT-774] Optimized docker images
1.2.3 (2024-05-13)
Fixed
- [PLAT-751] Crash when a second request arrives while processing is running
1.2.2 (2024-04-30)
Fixed
- [PLAT-748] Some licenses may be refused by the service
1.2.1 (2024-04-29)
Fixed
- Missing package in PyPI
1.2.0 (2024-04-25)
Added
- [PLAT-309] Machine translation
- [PLAT-669] Support for custom log tags in metadata and logs
Changed
- [VOX-667] Renamed microservice to enhanced-speech-to-text-built-on-whisper
1.1.0 (2024-02-21)
Added
- [PLAT-465] Parameters for selecting the part of the audio to process in TranscribeRequest (audio.time_range)
- [PLAT-581] Parameter for enabling language switching during the transcription in TranscribeRequest (config.enable_language_switching)
- [PLAT-603] Premature processing cancellation if connection is canceled by client
- [PLAT-625] Additional logging when initializing service
Fixed
- [PLAT-614] Inconsistent license logging
1.0.1 (2024-02-07)
Fixed
- [PLAT-609] Memory access violations
1.0.0 (2024-01-22)
Added
- [PLAT-521] Support for large-v3 Whisper model
- [VOX-396] Limiting of supported languages for Whisper models
- [PLAT-520] Correlation ID to log messages
- [PLAT-439] Optimization to model loading
- [PLAT-296] Phonexia Voice Activity Detection