What Is a User Configuration File and How to Use It

Advanced users with appropriate knowledge (gained e.g. by taking the Phonexia Academy Advanced Training) may want to fine-tune the behavior of the technologies to suit the nature of their audio data.

Modifying the original BSAPI configuration files directly can be dangerous: inappropriate changes may cause unpredictable behavior, and without a backup of the unmodified file it is difficult to restore the working state.
User configuration files provide a way to override processing parameters without modifying the original BSAPI configuration files.

Inappropriate configuration changes may cause serious issues!

A user configuration file is a plain text file with the same name as the main configuration file it overrides, plus the additional extension .usr. For example:

Main configuration file name	User configuration file name
`stt_cs_cz_5_online.bs`	`stt_cs_cz_5_online.bs.usr`
`kws_nl_nl_5.bs`	`kws_nl_nl_5.bs.usr`
`phnrec_pashto.bs`	`phnrec_pashto.bs.usr`
`vpextract4_xl4.bs`	`vpextract4_xl4.bs.usr`
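
The naming rule in the table above can be expressed as a small helper. This is only a sketch: the function name user_config_path is made up for illustration and is not part of BSAPI or SPE. It appends .usr to the full file name and leaves the existing .bs extension in place.

```python
from pathlib import Path
from typing import Union


def user_config_path(main_config: Union[str, Path]) -> Path:
    """Return the user configuration path for a main configuration file.

    Hypothetical helper: it appends ".usr" to the complete file name,
    so "kws_nl_nl_5.bs" becomes "kws_nl_nl_5.bs.usr" in the same directory.
    """
    main_config = Path(main_config)
    return main_config.with_name(main_config.name + ".usr")
```

Note that the extension is appended, not substituted: Path.with_suffix(".usr") would turn kws_nl_nl_5.bs into kws_nl_nl_5.usr, which does not match the naming rule.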

During technology initialization (e.g. during Speech Engine startup), the initialization routine checks whether such a user configuration file exists. If it does, it is loaded automatically after the main configuration file, and its settings are applied over the settings from the main configuration file.
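
The "main file first, user file on top" behavior can be modelled with Python's standard configparser. This is an illustrative model only, not how BSAPI itself loads its files. It assumes the configuration text can be read as INI-style sections with key=value lines. The main-file content below is invented for the example, apart from the default of 750 quoted in this article, and hypothetical_other_param is a made-up key.

```python
import configparser

SECTION = "vad.online_segmenter:SOnlineVoiceActivitySegmenterI"

# Invented stand-in for a main configuration file (not real BSAPI content).
MAIN_TEXT = f"""\
[{SECTION}]
forward_extensions_length_ms=750
hypothetical_other_param=1
"""

# User configuration file: contains only the value to override.
USER_TEXT = f"""\
[{SECTION}]
forward_extensions_length_ms=1500
"""


def effective_settings(main_text, user_text):
    """Read the main text, then the user text, so user values win."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(main_text)   # loaded first
    parser.read_string(user_text)   # loaded second, overrides matching keys
    return {name: dict(parser[name]) for name in parser.sections()}


settings = effective_settings(MAIN_TEXT, USER_TEXT)
```

In this model, a key that appears only in the main text keeps its original value, and a key that appears in both takes the value from the user text.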

Usage example:

When using Czech STT on realtime streams, the results show that the system outputs end of segment too often: longer pauses between words are misidentified as the end of a sentence, while the speaker in fact continues to speak. The goal is therefore to fine-tune the system to accept longer pauses between words without ending a sentence.

Following the How to configure STT realtime stream word detection parameters article, we create a stt_cs_cz_5_online.bs.usr text file alongside the original stt_cs_cz_5_online.bs configuration file in the <SPE directory>/bsapi/stt/settings directory and put the following lines in it (changing the forward extension length from the default 750 ms to 1500 ms):

[vad.online_segmenter:SOnlineVoiceActivitySegmenterI]
forward_extensions_length_ms=1500
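
If the file is created by a script instead of by hand, a minimal sketch could look like the following. The helper names render_user_config and write_user_config are hypothetical. UTF-8 encoding is an assumption (the content here is plain ASCII). The demo runs in a temporary directory with an empty placeholder in place of the real main configuration file, so it does not touch an actual SPE installation.

```python
import tempfile
from pathlib import Path


def render_user_config(section, overrides):
    """Build the text of a user configuration file with one section."""
    lines = [f"[{section}]"]
    lines += [f"{key}={value}" for key, value in overrides.items()]
    return "\n".join(lines) + "\n"


def write_user_config(main_config, section, overrides):
    """Write <main_config>.usr next to the main file and return its path.

    The main configuration file is never opened for writing; the helper
    only checks that it exists, to catch a mistyped name or directory.
    """
    main_config = Path(main_config)
    if not main_config.is_file():
        raise FileNotFoundError(f"main configuration file not found: {main_config}")
    user_config = main_config.with_name(main_config.name + ".usr")
    user_config.write_text(render_user_config(section, overrides), encoding="utf-8")
    return user_config


# Demo in a throwaway directory standing in for <SPE directory>/bsapi/stt/settings.
with tempfile.TemporaryDirectory() as tmp:
    main = Path(tmp) / "stt_cs_cz_5_online.bs"
    main.write_text("", encoding="utf-8")  # empty placeholder, not a real config
    created = write_user_config(
        main,
        "vad.online_segmenter:SOnlineVoiceActivitySegmenterI",
        {"forward_extensions_length_ms": 1500},
    )
    demo_name = created.name
    demo_text = created.read_text(encoding="utf-8")
```

Checking that the main file exists first helps avoid creating a .usr file under a wrong name, which would not match any main configuration file.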

Then, after restarting SPE, the STT results should show end of segment less frequently. Optionally, check the SPE log to confirm that the user configuration file stt_cs_cz_5_online.bs.usr was actually loaded; this information is available at the 'trace' logging level only.