Phoneme Recognizer

Phonexia Phoneme Recognizer (PHNREC) converts a speech signal into a sequence of phonemes (pronunciation symbols). The resulting phoneme transcription is text, so it can be indexed and searched with third-party text data mining tools.

The technology is optimized for noisy recordings and colloquial speech. The Phoneme Recognizer is delivered as part of the Keyword Spotting (KWS) technology but can also be used independently of KWS.

Typical use cases

"Search-in-speech" - searching for specific information in large call archives (e.g., claims inspection).
Obtaining a custom pronunciation of a word or phrase to use as a custom keyword in Keyword Spotting (improving KWS accuracy).
Obtaining a custom pronunciation of a word to add to the language model of speech-to-text technology.
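The "search-in-speech" use case can be sketched with ordinary text tooling. The example below is a minimal illustration, not part of the Phonexia product: it assumes the phoneme transcriptions are already available as plain-text strings, and the file names, the second transcription, and the helper functions (ngrams, build_index, search) are hypothetical. It builds an inverted index of phoneme trigrams and returns the recordings that contain every trigram of a query pronunciation.

```python
from collections import defaultdict

def ngrams(tokens, n):
    """Return every run of n consecutive phonemes as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def build_index(transcripts, n=3):
    """Map each phoneme n-gram to the set of recording ids that contain it."""
    index = defaultdict(set)
    for rec_id, text in transcripts.items():
        # Drop the special silence token before indexing.
        tokens = [t for t in text.split() if t != "sil"]
        for gram in ngrams(tokens, n):
            index[gram].add(rec_id)
    return index

def search(index, query, n=3):
    """Return ids of recordings that contain every n-gram of the query."""
    grams = ngrams(query.split(), n)
    if not grams:
        return set()  # query shorter than n phonemes
    result = set(index.get(grams[0], set()))
    for gram in grams[1:]:
        result &= index.get(gram, set())
    return result

# Hypothetical archive: ids and the second transcription are made up.
transcripts = {
    "call_001.wav": "sil hh ay dh ow s ih s l uw uw th sil",
    "call_002.wav": "sil g uh d m ao r n ih ng sil",
}
index = build_index(transcripts)
hits = search(index, "s ih s l")
print(hits)  # {'call_001.wav'}
```

A match here only means that all query trigrams occur somewhere in the recording, so the result is best treated as a candidate list to be verified against the full transcription.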

Input

Audio file; streaming is not supported.
Technology model name (i.e., language code) to be used for phoneme transcription.

Output

Example

Input: "Hi, this is Lewis." (WAV file containing speech)

Output: sil hh ay dh ow s ih s l uw uw th sil (plain-text or XML/JSON format)

Note: The output may contain the following special tokens:

sil: Silent part (or no speech detected).

The list of phonemes is available in the document phonemes_for_stt_and_kws.pdf (provided in the manuals for SPE, STT, or KWS).
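As a small illustration of handling the special token, the plain-text output from the example above can be split into speech segments at each sil. This is only a sketch that assumes the output is a single whitespace-separated string; the function name split_on_silence is hypothetical and not part of any Phonexia API.

```python
def split_on_silence(output, silence_token="sil"):
    """Split a phoneme string into speech segments separated by silence."""
    segments, current = [], []
    for token in output.split():
        if token == silence_token:
            # Close the current segment, skipping empty ones.
            if current:
                segments.append(current)
                current = []
        else:
            current.append(token)
    if current:
        segments.append(current)
    return segments

segments = split_on_silence("sil hh ay dh ow s ih s l uw uw th sil")
print(segments)
# [['hh', 'ay', 'dh', 'ow', 's', 'ih', 's', 'l', 'uw', 'uw', 'th']]
```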

Supported languages

The list of supported languages for the Phoneme Recognizer is the same as for Keyword Spotting.
