Version: 3.3.0

Phonexia Enhanced Speech to Text Built on Whisper changelog

1.6.0 (2024-09-12)

Added

[PLAT-888] Added PHX_KEEPALIVE_TIME_S and PHX_KEEPALIVE_TIMEOUT_S options for connection timeout specification
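
The `_S` suffix suggests both options are given in seconds. As a rough illustration only, and assuming gRPC-style keepalive (this changelog does not confirm the transport), a wrapper could read the two variables and convert them to the millisecond channel arguments `grpc.keepalive_time_ms` and `grpc.keepalive_timeout_ms`. The helper name `keepalive_options` and the handling of unset variables are hypothetical, not part of the product.

```python
import os


def keepalive_options(env=None):
    """Build gRPC-style channel arguments from the PHX_KEEPALIVE_* variables.

    Hypothetical helper: variables that are not set are skipped, so the
    transport defaults stay in effect for them.
    """
    env = os.environ if env is None else env
    mapping = {
        "PHX_KEEPALIVE_TIME_S": "grpc.keepalive_time_ms",
        "PHX_KEEPALIVE_TIMEOUT_S": "grpc.keepalive_timeout_ms",
    }
    options = []
    for var, arg in mapping.items():
        value = env.get(var)
        if value is not None:
            # Assumed unit conversion: seconds in the variable, milliseconds in the argument.
            options.append((arg, int(float(value) * 1000)))
    return options
```

The resulting list has the shape that gRPC channel and server constructors accept as `options`; whether the service consumes the variables this way is an assumption.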

1.5.0 (2024-08-05)

Changed

[PLAT-801] If set, start_time is added to all segment timestamps
[PLAT-838] Beam size can be set at startup
[PLAT-813] Language tags are now case-insensitive
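
The effect of PLAT-801 and PLAT-813 can be sketched as follows. This is a minimal illustration, not the service's implementation: the `Segment` type, its field names, and both helper functions are hypothetical, and timestamps are assumed to be in seconds.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Segment:
    """Hypothetical transcript segment with timestamps in seconds."""
    text: str
    start: float
    end: float


def shift_segments(segments, start_time=None):
    """Add start_time to every segment timestamp when it is set."""
    if not start_time:
        # Unset (or zero) offset: timestamps stay relative to the audio start.
        return list(segments)
    return [
        replace(s, start=s.start + start_time, end=s.end + start_time)
        for s in segments
    ]


def same_language(tag_a, tag_b):
    """Compare two language tags without regard to letter case."""
    return tag_a.casefold() == tag_b.casefold()
```

For example, a segment spanning 0.5 s to 1.5 s with `start_time=10.0` would be reported as 10.5 s to 11.5 s, and the tags `EN` and `en` would be treated as the same language.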

1.4.0 (2024-07-11)

Added

[PLAT-772] Support for sending RAW audio data
[PLAT-781] Support for license extensions

Fixed

[PLAT-758] Device is logged as a number rather than a string

1.3.0 (2024-06-25)

Added

[PLAT-644] Support for distilled Whisper v3 models

1.2.5 (2024-06-06)

Fixed

[PLAT-788] Missing entrypoint in GPU Docker image

1.2.4 (2024-05-30)

Fixed

[PLAT-756] Inconsistent transcription behavior between autodetect and forced language mode
[PLAT-766] Misleading warning message in Python client
[PLAT-765] Python client is not compatible with Python 3.8
[PLAT-774] Optimized Docker images

1.2.3 (2024-05-13)

Fixed

[PLAT-751] Crash when a second request arrives while processing is running

1.2.2 (2024-04-30)

Fixed

[PLAT-748] Some licenses may be refused by the service

1.2.1 (2024-04-29)

Fixed

Missing package on PyPI

1.2.0 (2024-04-25)

Added

[PLAT-309] Machine translation
[PLAT-669] Support for custom log tags in metadata and logs

Changed

[VOX-667] Renamed microservice to enhanced-speech-to-text-built-on-whisper

1.1.0 (2024-02-21)

Added

[PLAT-465] Parameters for selecting the part of the audio to process in TranscribeRequest (audio.time_range)
[PLAT-581] Parameter for enabling language switching during the transcription in TranscribeRequest (config.enable_language_switching)
[PLAT-603] Premature processing cancellation if connection is canceled by client
[PLAT-625] Additional logging when initializing service
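
To illustrate what selecting a part of the audio (PLAT-465) amounts to, here is a minimal sketch that maps a time range to sample indices. The field layout of `audio.time_range` is not described in this changelog, so the function name, the `start_s`/`end_s` parameters, and the clamping behaviour are all assumptions.

```python
def slice_by_time_range(samples, sample_rate, start_s=0.0, end_s=None):
    """Return the samples that fall inside [start_s, end_s).

    Hypothetical helper: an omitted end_s means "until the end of the audio",
    and out-of-range bounds are clamped to the available samples.
    """
    first = max(0, int(round(start_s * sample_rate)))
    if end_s is None:
        last = len(samples)
    else:
        last = min(len(samples), int(round(end_s * sample_rate)))
    return samples[first:last]
```

With 8 kHz audio, a range from 0.5 s to 1.0 s would select samples 4000 up to, but not including, 8000.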

Fixed

[PLAT-614] Inconsistent license logging

1.0.1 (2024-02-07)

Fixed

[PLAT-609] Memory access violations

1.0.0 (2024-01-22)

Added

[PLAT-521] Support for large-v3 Whisper model
[VOX-396] Limiting of supported languages for Whisper models
[PLAT-520] Correlation ID to log messages
[PLAT-439] Optimization to model loading
[PLAT-296] Phonexia Voice Activity Detection