Version: 2026.03.0-rc1

Grapheme To Phoneme Conversion

Phonexia grapheme-to-phoneme-conversion is a tool for converting word spellings (graphemes) to phonetic pronunciations (phonemes). It is useful in combination with keyword spotting technology, which allows users to specify pronunciations for specific keywords. To learn more, visit the Keyword Spotting home page.

Installation

Getting the grapheme-to-phoneme-conversion Docker image

You can obtain the grapheme-to-phoneme-conversion Docker image from Docker Hub by running:

docker pull phonexia/grapheme-to-phoneme-conversion:latest

Running the image

Docker

To list all the options supported by the microservice, run the image with --help:

docker run --rm -it phonexia/grapheme-to-phoneme-conversion:latest --help

The output should look like this:

grapheme_to_phoneme_conversion 1.0.0

grapheme_to_phoneme_conversion [OPTIONS]

OPTIONS:
  -h, --help Print this help message and exit
  --version Display program version information and exit
  -m, --model file REQUIRED (Env:PHX_MODEL_PATH)
      Path to a model file.
  -a, --listening_address address [[::]] (Env:PHX_LISTENING_ADDRESS)
      Address on which the server will be listening. Address '[::]'
      also accepts IPv4 connections.
  -p, --port number [8080] (Env:PHX_PORT)
      Port on which the server will be listening.
  -l, --log_level level:{error,warning,info,debug,trace} [info] (Env:PHX_LOG_LEVEL)
      Logging level. Possible values: error, warning, info, debug,
      trace.
  --keepalive_time_s number:[0, max_int] [60] (Env:PHX_KEEPALIVE_TIME_S)
      Time between 2 consecutive keep-alive messages, that are sent if
      there is no activity from the client. If set to 0, the default
      gRPC configuration (2hr) will be set (note, that this may get the
      microservice into unresponsive state).
  --keepalive_timeout_s number:[1, max int] [20] (Env:PHX_KEEPALIVE_TIMEOUT_S)
      Time to wait for keep alive acknowledgement until the connection
      is dropped by the server.
  --num_instances_per_device NUM:UINT > 0 (Env:PHX_NUM_INSTANCES_PER_DEVICE)
      Number of instances. Microservice can process requests
      concurrently if value is >1.

Note: The model option is required. To obtain the model, contact Phonexia.

You can specify the options either via command-line arguments or via environment variables.

Run the container with the mandatory parameters:

docker run --rm -it -v /opt/phx/models:/models -p 8080:8080 phonexia/grapheme-to-phoneme-conversion:latest --model /models/denoiser/generic-1.2.0.model

The path /opt/phx/models is the location on the host system where the models provided by Phonexia are stored. This directory is mounted as a volume at /models inside the Docker container, which makes the models accessible to the microservice. Replace /opt/phx/models and the model path given after --model (ending in generic-1.2.0.model in this example) with the values that match your setup.
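Every option in the help output also lists an environment variable (shown as Env:...), so the same container can in principle be configured without command-line arguments. The sketch below shows one way to assemble such a docker run command from Python. The helper build_docker_command and the model path in the usage part are illustrative placeholders, not part of any Phonexia tooling, and the sketch assumes that setting PHX_MODEL_PATH is sufficient when --model is omitted.

```python
import shlex

# Image name as used in the "docker pull" command in this document.
IMAGE = "phonexia/grapheme-to-phoneme-conversion:latest"


def build_docker_command(host_models_dir: str, model_path_in_container: str,
                         port: int = 8080, log_level: str = "info") -> list:
    """Build a `docker run` argument list that configures the service via env vars.

    The PHX_* variable names are taken from the --help output; everything
    else here is a hypothetical helper for illustration.
    """
    return [
        "docker", "run", "--rm", "-it",
        "-v", f"{host_models_dir}:/models",          # mount host models into the container
        "-p", f"{port}:{port}",                      # publish the service port on the host
        "-e", f"PHX_MODEL_PATH={model_path_in_container}",
        "-e", f"PHX_PORT={port}",
        "-e", f"PHX_LOG_LEVEL={log_level}",
        IMAGE,
    ]


if __name__ == "__main__":
    # "/models/your-model.model" is a placeholder, not a real model name.
    cmd = build_docker_command("/opt/phx/models", "/models/your-model.model",
                               log_level="debug")
    # Print a copy-pasteable shell command instead of executing it.
    print(shlex.join(cmd))
```

Printing the command rather than executing it keeps the sketch free of side effects; the resulting list could be passed to subprocess.run if launching the container from a script is desired.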

Once the container is running, the microservice listens on port 8080 on localhost.
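To confirm from a script that the container is accepting connections, a plain TCP probe is often enough for a first check. The following is a minimal sketch using only the Python standard library. The helper wait_for_port is hypothetical, and a successful TCP connection only shows that the port is open, not that the gRPC service is ready to serve requests.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 10.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # Open and immediately close a connection; success means something is listening.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(0.2)


if __name__ == "__main__":
    # Host and port match the docker run example (-p 8080:8080).
    if wait_for_port("localhost", 8080):
        print("port 8080 is accepting connections")
    else:
        print("port 8080 is not reachable")
```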

Microservice communication

gRPC API

For communication, our microservices use gRPC, which is a high-performance, open-source Remote Procedure Call (RPC) framework that enables efficient communication between distributed systems using a variety of programming languages. We use an interface definition language to specify a common interface and contracts between components. This is primarily achieved by specifying methods with parameters and return types.

Take a look at our gRPC API documentation. The grapheme-to-phoneme-conversion microservice defines a GraphemeToPhonemeConversion service with a remote procedure called Convert. This procedure accepts an argument (also referred to as a "message") called ConvertRequest, which contains a list of word spellings in grapheme form.

The Convert procedure returns a message called ConvertResponse, which consists of a list of word pronunciations in phoneme form.

Connecting to microservice

There are several ways to communicate with our microservices.

Phonexia Python client

The easiest way to get started with testing is to use our simple Python client. To get it, run:

pip install phonexia-grapheme-to-phoneme-conversion-client

After the successful installation, run the following command to see the client options:

grapheme_to_phoneme_conversion_client --help

Versioning

We use Semantic Versioning.