Voiceprint Comparison
Phonexia voiceprint-comparison is a tool for comparing voiceprints obtained from audio recordings using Phonexia voiceprint-extraction. To learn more, visit this page.
Versioning
We use SemVer for versioning.
Quick reference
- Maintained by Phonexia
- Contact us via e-mail, or using Phonexia Service Desk
- File an issue
- See list of licenses
- See terms of use
How to use this image
Getting the image
You can obtain the Docker image from Docker Hub by running:
docker pull phonexia/voiceprint-comparison:latest
Running the image
You can start the microservice and list all the supported options by running:
docker run --rm -it phonexia/voiceprint-comparison:latest --help
The output should look like this:
Usage: voiceprint-comparison [OPTIONS]
Options:
-h,--help Print this help message and exit
-m,--model file/dir REQUIRED (Env:PHX_MODEL_PATH)
Path to model file or directory.
-k,--license_key string REQUIRED (Env:PHX_LICENSE_KEY)
License key.
-a,--listening_address address [[::]] (Env:PHX_LISTENING_ADDRESS)
Address on which the server will be listening. Address '[::]' also accepts IPv4 connections.
-p,--port number [8080] (Env:PHX_PORT)
Port on which the server will be listening.
-l,--log_level level [info] (Env:PHX_LOG_LEVEL)
Logging level. Possible values: error, warning, info, debug, trace.
Note that the model and license_key options are required. To obtain the model and license, contact Phonexia.
You can specify the options either via command line arguments or via environment variables.
Run the container with the mandatory parameters:
docker run --rm -it -v ${absolute-path-to-models}:/models phonexia/voiceprint-comparison:latest --model /models/${model} --license_key ${license-key}
Replace absolute-path-to-models, model, and license-key with the corresponding values.
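With this command, the container starts and the microservice listens on port 8080 inside the container. To reach it from the host at localhost:8080, publish the port by adding -p 8080:8080 to the docker run command.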
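The options can also be passed through the environment variables listed in the help output instead of command line arguments. The following is a minimal sketch of how such a command could be assembled from Python. The helper name docker_run_args and the example values are hypothetical; the PHX_* variable names come from the help output above, and -e and -p are standard docker run flags for setting environment variables and publishing a port.

```python
import shlex


def docker_run_args(models_dir, model_name, license_key, port=8080):
    """Build a `docker run` argument list that configures the service via PHX_* variables.

    Hypothetical helper for illustration; it only assembles the command,
    it does not start the container.
    """
    return [
        "docker", "run", "--rm", "-it",
        # Mount the host directory with the model into the container.
        "-v", f"{models_dir}:/models",
        # Publish the service port on the host.
        "-p", f"{port}:{port}",
        # Environment variables replace --model, --license_key and --port.
        "-e", f"PHX_MODEL_PATH=/models/{model_name}",
        "-e", f"PHX_LICENSE_KEY={license_key}",
        "-e", f"PHX_PORT={port}",
        "phonexia/voiceprint-comparison:latest",
    ]


if __name__ == "__main__":
    # Placeholder values; replace them with your own paths and license key.
    print(shlex.join(docker_run_args("/abs/path/to/models", "model-file", "license-key")))
```

The printed command can be pasted into a shell, or the list can be handed to subprocess.run if you prefer to start the container from Python.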
Microservice communication
gRPC API
For communication, our microservices use gRPC, which is a high-performance, open-source Remote Procedure Call (RPC) framework that enables efficient communication between distributed systems using a variety of programming languages. We use an interface definition language to specify a common interface and contracts between components. This is primarily achieved by specifying methods with parameters and return types.
Take a look at our gRPC API documentation. The voiceprint-comparison microservice defines a VoiceprintComparison service with a remote procedure called Compare. This procedure performs M to N comparison of voiceprints. It accepts an argument (also referred to as a "message") called CompareRequest, which contains two arrays of voiceprints for the comparison called voiceprints_a and voiceprints_b. This CompareRequest argument is streamed, meaning that it may be received in multiple requests, each containing some of the voiceprints. Once all requests have been received and processed, the Compare procedure returns a message called CompareResponse, which consists of the resulting Matrix of scores.
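To make the M to N idea concrete, here is a small illustrative sketch in plain Python. It is not the Phonexia scoring algorithm: toy_score is a hypothetical stand-in (a dot product over made-up vectors), and the orientation of the matrix (rows for voiceprints_a, columns for voiceprints_b) is an assumption, so check the gRPC API documentation for the actual layout.

```python
def compare_all(voiceprints_a, voiceprints_b, score):
    """Return an M x N matrix; entry [i][j] scores voiceprints_a[i] against voiceprints_b[j]."""
    return [[score(a, b) for b in voiceprints_b] for a in voiceprints_a]


def toy_score(a, b):
    # Hypothetical stand-in for the real scoring: a plain dot product.
    return sum(x * y for x, y in zip(a, b))


# Made-up vectors standing in for voiceprints.
group_a = [[1.0, 0.0], [0.0, 1.0]]              # M = 2
group_b = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]]  # N = 3

matrix = compare_all(group_a, group_b, toy_score)
for row in matrix:
    print(row)
```

Every voiceprint in the first group is scored against every voiceprint in the second group, so M and N voiceprints yield M * N scores.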
Connecting to microservice
There are multiple ways to communicate with our microservices.
Using generated library
The most common way to communicate with the microservices is via a programming language using a generated library.
Python library
If you use Python as your programming language, you can use our official gRPC Python library.
To get this library, simply run:
pip install phonexia-grpc
You can then import:
- Specific libraries for each microservice that provide the message wrappers.
- Stubs for the gRPC clients.
# phx_core contains classes common for multiple microservices like `Voiceprint`.
import phonexia.grpc.common.core_pb2 as phx_core
# speaker_identification_pb2 contains `CompareRequest` and `CompareResponse`.
import phonexia.grpc.technologies.speaker_identification.v1.speaker_identification_pb2 as sid
# speaker_identification_pb2_grpc contains `VoiceprintComparisonStub` needed to make requests.
import phonexia.grpc.technologies.speaker_identification.v1.speaker_identification_pb2_grpc as sid_grpc
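Because CompareRequest is streamed, a client typically passes an iterator of request messages to the stub. The sketch below shows one possible way to split voiceprints into several requests. The helper names and the batch size are hypothetical, the message classes are passed in as parameters so the logic can be tried without the library installed, and the keyword arguments (voiceprints_a, voiceprints_b, content) are assumed from the field names used in this document. Whether the service accepts the two arrays in separate requests is also an assumption; consult the gRPC API documentation.

```python
def batched(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def make_requests(voiceprints_a, voiceprints_b, request_cls, voiceprint_cls, batch_size=100):
    """Yield request messages, each carrying at most `batch_size` voiceprints.

    `voiceprints_a` and `voiceprints_b` are lists of raw voiceprint bytes.
    `request_cls` and `voiceprint_cls` are the message classes to instantiate.
    """
    for chunk in batched(voiceprints_a, batch_size):
        yield request_cls(voiceprints_a=[voiceprint_cls(content=c) for c in chunk])
    for chunk in batched(voiceprints_b, batch_size):
        yield request_cls(voiceprints_b=[voiceprint_cls(content=c) for c in chunk])
```

With the imports above, the generator would presumably be used as sid_grpc.VoiceprintComparisonStub(channel).Compare(make_requests(a, b, sid.CompareRequest, phx_core.Voiceprint)), where channel is created with grpc.insecure_channel("localhost:8080").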
Generate library for programming language of your choice
We define the microservice interfaces with protocol buffers, the standard approach for gRPC. The services, together with the procedures and messages that they expose, are defined in so-called proto files.
The .proto files can be used to generate client libraries in many programming languages. Take a look at protobuf tutorials to get started with generating the library in the language of your choice using the protoc tool.
You can find the proto files developed by Phonexia in this repository.
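For Python, the protoc compiler and the gRPC plugin are bundled in the grpcio-tools package (pip install grpcio-tools) and can be invoked as the grpc_tools.protoc module. The following sketch only assembles such a command; the helper name protoc_command and the paths are hypothetical placeholders, not the actual layout of the Phonexia repository.

```python
import sys


def protoc_command(proto_root, out_dir, proto_files):
    """Build the argument list for generating Python message classes and gRPC stubs."""
    return [
        sys.executable, "-m", "grpc_tools.protoc",
        f"-I{proto_root}",               # root directory used to resolve imports
        f"--python_out={out_dir}",       # where the *_pb2.py message modules go
        f"--grpc_python_out={out_dir}",  # where the *_pb2_grpc.py stub modules go
        *proto_files,
    ]


if __name__ == "__main__":
    # Placeholder paths; point them at your checkout of the proto files.
    print(protoc_command("path/to/protos", "generated", ["path/to/protos/service.proto"]))
```

Passing the resulting list to subprocess.run(cmd, check=True) runs the generation, provided grpcio-tools is installed and the output directory exists.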
Using existing clients
Phonexia Python client
The easiest way to get started with testing is to use our simple Python client. To get it, run:
pip install phonexia-voiceprint-comparison-client
After the successful installation, run the following command to see the client options:
voiceprint_comparison_client --help
grpcurl client
If you need a simple tool for testing the microservice on the command line, you can use grpcurl. This tool can serialize and send a request for you, if you provide the request body in JSON format and specify the endpoint.
You need to make sure that the voiceprint content in the body is encoded in Base64. Unfortunately, you need to do this manually, as grpcurl cannot do this for you.
{
"voiceprints_a": [
{
"content": "${voiceprint_1}"
},
...
],
"voiceprints_b": [
{
"content": "${voiceprint_2}"
},
...
]
}
For this, you need to have voiceprints already extracted. If you don't have any yet, you can obtain some using the Phonexia voiceprint-extraction microservice.
In the request, both the voiceprints_a and voiceprints_b arrays may contain multiple voiceprints.
Get such a request body with the following command:
echo -n '{"voiceprints_a": [{"content": "'$(cat ${path_to_voiceprint_1})'"}],"voiceprints_b": [{"content": "'$(cat ${path_to_voiceprint_2})'"}]}' > ${path_to_body}
Replace path_to_voiceprint_1, path_to_voiceprint_2, and path_to_body with the corresponding values.
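The shell command above inserts the file contents as they are, so it presumably expects the voiceprint files to already hold Base64 text. If your voiceprints are stored as raw binary data instead, a body with several voiceprints per array could be built in Python along these lines. This is a minimal sketch; the helper name build_body is hypothetical, and only the JSON field names are taken from the request body shown earlier.

```python
import base64
import json


def build_body(voiceprints_a, voiceprints_b):
    """Return a JSON request body from two lists of raw voiceprint bytes."""

    def encode(raw):
        # JSON cannot carry raw bytes, so each voiceprint is Base64-encoded.
        return {"content": base64.b64encode(raw).decode("ascii")}

    return json.dumps({
        "voiceprints_a": [encode(v) for v in voiceprints_a],
        "voiceprints_b": [encode(v) for v in voiceprints_b],
    })


if __name__ == "__main__":
    # Dummy bytes for illustration only; real voiceprints come from voiceprint-extraction.
    print(build_body([b"\x00\x01"], [b"abc"]))
```

In practice you would read each file with pathlib.Path(path).read_bytes(), pass the resulting bytes to build_body, and write the returned string to the body file.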
Now you can make the request. The microservice supports reflection, meaning that you don't need to know the API in advance to make a request.
grpcurl -plaintext -use-reflection -d @ localhost:8080 phonexia.grpc.technologies.speaker_identification.v1.VoiceprintComparison/Compare < ${path_to_body}
grpcurl automatically serializes the response to this request into JSON, including the Matrix with the scores.
GUI clients
If you'd prefer to use a GUI client like Postman to test the microservice, take a look at the GUI Client page in our documentation. Note that you will still need to convert the voiceprints into the Base64 format manually, as those tools do not support it by default either.