What Is a User Configuration File and How to Use It
Advanced users with appropriate knowledge (gained, for example, by taking the Phonexia Academy Advanced Training) may want to fine-tune the behavior of the technologies to match the nature of their audio data.
Modifying original BSAPI configuration files directly can be dangerous –
inappropriate changes may cause unpredictable behavior and without having a
backup of the unmodified file it's difficult to restore the working state.
User configuration files provide a way to override processing parameters without
modifying original BSAPI configuration files.
A user configuration file is a plain text file with the same name as the main
configuration file, plus the additional extension .usr. For example:
Main configuration file name | User configuration file name |
---|---|
stt_cs_cz_5_online.bs | stt_cs_cz_5_online.bs.usr |
kws_nl_nl_5.bs | kws_nl_nl_5.bs.usr |
phnrec_pashto.bs | phnrec_pashto.bs.usr |
vpextract4_xl4.bs | vpextract4_xl4.bs.usr |
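The naming rule in the table can be sketched in a few lines of Python. This is only an illustration of the convention described above; the helper name user_config_path is hypothetical and is not part of BSAPI or Speech Engine.

```python
from pathlib import Path


def user_config_path(main_config: Path) -> Path:
    """Return the user configuration path for a main configuration file.

    Hypothetical helper: it only encodes the naming rule from the table.
    """
    # Append ".usr" to the full file name; the ".bs" extension is kept.
    return main_config.with_name(main_config.name + ".usr")


print(user_config_path(Path("stt_cs_cz_5_online.bs")))
# prints: stt_cs_cz_5_online.bs.usr
```

Note that the extension is appended rather than substituted, so Path.with_suffix would give the wrong result here (it would replace .bs with .usr).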
During technology initialization (e.g. during Speech Engine startup), the initialization routine checks whether such a user configuration file exists. If it is found, it is loaded automatically after the main configuration file, and its settings are applied over the settings from the main configuration file.
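The "load main, then apply user settings on top" behavior can be illustrated with a minimal Python sketch. It does not reproduce the actual BSAPI loader: it assumes the files can be treated as INI-like text and uses Python's configparser purely to show the layering. The function load_layered and the key other_parameter are hypothetical; only the section name and the 750 and 1500 values come from this article.

```python
import configparser

# Stand-in for a main configuration file (other_parameter is hypothetical).
MAIN_TEXT = """
[vad.online_segmenter:SOnlineVoiceActivitySegmenterI]
forward_extensions_length_ms=750
other_parameter=1
"""

# Stand-in for the matching .usr file: only the overridden value is listed.
USER_TEXT = """
[vad.online_segmenter:SOnlineVoiceActivitySegmenterI]
forward_extensions_length_ms=1500
"""


def load_layered(main_text, user_text=None):
    """Parse the main settings, then apply user settings over them."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names exactly as written
    parser.read_string(main_text)
    if user_text is not None:
        # Values read later replace values read earlier for the same key.
        parser.read_string(user_text)
    return parser


merged = load_layered(MAIN_TEXT, USER_TEXT)
section = "vad.online_segmenter:SOnlineVoiceActivitySegmenterI"
print(merged[section]["forward_extensions_length_ms"])  # prints: 1500
```

Settings that the user file does not mention (other_parameter in this sketch) keep their values from the main file, which is why a user file only needs to contain the parameters being changed.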
Usage example:
When Czech STT is used on realtime streams, the results show that the system outputs end of segment too often: longer pauses between words are misidentified as the end of a sentence, even though the speaker actually continues to speak. The goal is therefore to fine-tune the system to accept longer pauses between words without ending the sentence.
So, following the "How to configure STT realtime stream word detection parameters" article, we create a stt_cs_cz_5_online.bs.usr text file alongside the original stt_cs_cz_5_online.bs configuration file in the <SPE directory>/bsapi/stt/settings directory and put the following lines in it (changing the forward extension parameter from the default 750 to 1500):
[vad.online_segmenter:SOnlineVoiceActivitySegmenterI]
forward_extensions_length_ms=1500
Then, after restarting SPE (and optionally checking in the SPE log that the user
configuration file stt_cs_cz_5_online.bs.usr was really loaded; this information
is available at the 'trace' logging level only), the STT results should show end
of segment less frequently.
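If the user configuration file is created by a deployment script rather than by hand, the step could look roughly like the Python sketch below. The helper write_user_config is hypothetical, the UTF-8 encoding is an assumption (the content here is plain ASCII), and the demonstration runs against a temporary directory with an empty placeholder file instead of a real <SPE directory>/bsapi/stt/settings directory. Restarting SPE is still a separate manual step.

```python
import tempfile
from pathlib import Path

# The two override lines from the usage example above.
USER_OVERRIDES = (
    "[vad.online_segmenter:SOnlineVoiceActivitySegmenterI]\n"
    "forward_extensions_length_ms=1500\n"
)


def write_user_config(settings_dir: Path, main_name: str, content: str) -> Path:
    """Write <main_name>.usr next to the main configuration file."""
    main_path = settings_dir / main_name
    if not main_path.is_file():
        # Refuse to create an orphan .usr file with no main file beside it.
        raise FileNotFoundError(f"main configuration file not found: {main_path}")
    user_path = main_path.with_name(main_path.name + ".usr")
    user_path.write_text(content, encoding="utf-8")  # encoding is an assumption
    return user_path


# Demonstration in a temporary directory standing in for the settings directory.
with tempfile.TemporaryDirectory() as tmp:
    settings = Path(tmp)
    (settings / "stt_cs_cz_5_online.bs").write_text("", encoding="utf-8")
    created = write_user_config(settings, "stt_cs_cz_5_online.bs", USER_OVERRIDES)
    print(created.name)
    print(created.read_text(encoding="utf-8"))
```

The main configuration file is never opened for writing, so the original settings remain untouched and the change can be reverted by deleting the .usr file.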