Version: 4.0.0-rc1

Transcription Normalization

Phonexia transcription-normalization is a tool for normalizing transcriptions produced by Phonexia speech-to-text (6th generation). The microservice takes a transcription as input and returns the normalized transcription. Normalization can convert numbers, dates, and similar expressions from textual to numeric form (e.g. twenty-four -> 24) and/or add punctuation and capitalization. To learn more, visit the technology's home page.

Installation

Getting the Transcription Normalization Docker image

You can obtain the Transcription Normalization Docker image from Docker Hub by running:

docker pull phonexia/transcription-normalization:latest

Running the image

Docker

To list all the options supported by the microservice, run the image with --help:

docker run --rm -it phonexia/transcription-normalization:latest --help

The output should look like this:

Usage: transcription-normalization [OPTIONS]

  You can use environment variables in format PHX_<OPTION_NAME> instead of
  command line arguments.

Options:
  -m, --model PATH                Path to a model file.  [required]
  -l, --log_level [fatal|error|warning|info|debug]
                                  Logging level.
  --log_format [human|json]       Logging format.
  -a, --listening_address TEXT    Address where the server will listen. The
                                  address '[::]' also accepts IPv4
                                  connections.
  -p, --port INTEGER RANGE        Port on which the server will be listening.
                                  [1<=x<=65535]
  --help                          Show this message and exit.
Note: The model and license_key options are required. To obtain the model and license, contact Phonexia.

You can specify the options either via command line arguments or via environment variables.
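As a minimal sketch of the PHX_<OPTION_NAME> convention from the help text, the hypothetical helper below derives an environment variable name from a long option name. It assumes the option name is upper-cased (for example PHX_LOG_LEVEL), which the help text does not state explicitly, so check the resulting names against your deployment before relying on them.

```python
def option_to_env_var(option: str) -> str:
    """Map a long option such as '--log_level' to a PHX_-prefixed name.

    Assumption: the option name is upper-cased; the help text only
    states the general pattern PHX_<OPTION_NAME>.
    """
    name = option.lstrip("-").replace("-", "_").upper()
    return f"PHX_{name}"


# Print the assumed variable name for each documented long option.
for option in ("--model", "--log_level", "--log_format",
               "--listening_address", "--port"):
    print(f"{option} -> {option_to_env_var(option)}")
```

Under this assumption, a variable could be passed to the container with Docker's -e flag, for example -e PHX_LOG_LEVEL=debug.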

Run the container with the mandatory parameters:

docker run --rm -it -v /opt/phx/models:/models -p 8080:8080 phonexia/transcription-normalization:latest --model /models/transcription_normalization-generic-1.0.0.model --license_key ${license-key}

Replace /opt/phx/models, transcription_normalization-generic-1.0.0.model, and license-key with the values for your environment.

With this command, the container will start, and the microservice will be listening on port 8080 on localhost.
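To confirm that something is accepting connections on the published port, a plain TCP check is usually enough. The sketch below uses only the Python standard library; the function name is_port_open is hypothetical, and the defaults (localhost, port 8080) simply mirror the docker run command above. It verifies only that the port accepts TCP connections, not that the gRPC service itself is healthy.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 8080,
                 timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        return False
```

Calling is_port_open() after starting the container should return True once the server has finished loading the model.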

Microservice communication

gRPC API

For communication, our microservices use gRPC, which is a high-performance, open-source Remote Procedure Call (RPC) framework that enables efficient communication between distributed systems using a variety of programming languages. We use an interface definition language to specify a common interface and contracts between components. This is primarily achieved by specifying methods with parameters and return types.

Take a look at our gRPC API documentation. The transcription-normalization microservice defines a TranscriptionNormalization service with a remote procedure called Normalize. This procedure normalizes the text segments. It accepts an argument (also referred to as a "message") called NormalizeRequest, which contains an array of segments. The NormalizeRequest argument is streamed, meaning that it may be received in multiple requests, each containing some of the segments. Once all requests have been received and processed, the Normalize procedure returns a message called NormalizeResponse, which consists of the normalized segments.
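Because the request is streamed, a client typically hands the RPC an iterator that yields one request message per batch of segments. The sketch below shows only that batching step. The helper chunk_segments is hypothetical, and plain lists of strings stand in for NormalizeRequest messages; the real message classes and segment fields come from the stubs generated from the gRPC API definition and are not reproduced here.

```python
from typing import Iterable, Iterator, List


def chunk_segments(segments: Iterable[str],
                   chunk_size: int) -> Iterator[List[str]]:
    """Yield consecutive batches of at most chunk_size segments.

    Each yielded batch stands in for the segments of one streamed
    request; building the actual request message is left to the
    generated gRPC code.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    chunk: List[str] = []
    for segment in segments:
        chunk.append(segment)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        # Emit the final, possibly shorter, batch.
        yield chunk


# Illustrative input only; real segments carry more than plain text.
for batch in chunk_segments(["twenty-four", "first of may", "hello"], 2):
    print(batch)
```

With generated stubs, each batch would be wrapped in a request message, and the resulting generator passed to the Normalize call, which returns a single response after the stream ends.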

Connecting to microservice

There are multiple ways to communicate with our microservices.

Phonexia Python client

The easiest way to get started with testing is to use our simple Python client. To get it, run:

pip install phonexia-transcription-normalization-client

After a successful installation, run the following command to see the client options:

transcription_normalization_client --help

Versioning

We use Semantic Versioning.