Version: 4.0.0

Speaker Identification

Phonexia Speaker Identification uses the power of voice biometrics to recognize speakers by their voice, i.e. to decide whether the voice in two recordings belongs to the same person or two different people. To facilitate quick and intuitive visualization of comparison results, we have developed a user-friendly graphical user interface (GUI).

This page explains how to use Phonexia Speaker Identification in our web application. If you want to dive deeper into the inner workings of this technology, check out our detailed technical documentation.

Input data

Phonexia Speaker Identification supports traditional audio formats as well as voiceprints, facilitating comprehensive voice comparisons. For further information on voiceprints, you can read this article or watch videos available in the "How it works" section located above the cards.

How it works

The Speaker Identification GUI consists of two cards, left and right (or upper and lower in the mobile version). The first step is to upload recordings of known speakers, i.e. speakers with a confirmed identity, to the left card. Recordings of unknown speakers are uploaded to the other card. The recordings in the left card are then compared with the recordings in the right card to determine whether any of the known speakers can be found among the unknown recordings.

info

The cards are interchangeable: it does not matter which card holds the known voices and which holds the unknown voices.

Understanding the results

Similarity scores

Once all recordings are uploaded to their respective cards, they are converted into voiceprints for comparison. The results of the comparison are promptly displayed.

Selecting a recording in either card reveals its similarity scores against the recordings in the other card. Both the color and the value denote the likelihood of a voice match, with a higher score indicating a more likely match. Green signifies a match, yellow suggests a possible match, and red indicates no match. White means that the technology did not recognize any speech segments in the audio.

info

To quickly familiarize yourself with the user interface, you can watch a short product tour. It is presented automatically to every first-time user, and you can access it at any time in the "How it works" section located above the cards. The tour briefly explains the upload process and how to interpret the results.

Both recordings and voiceprints can be downloaded from the cards at any time.

Setting the threshold

In the default configuration, a similarity score below -1 is considered a non-match and a score above 3 is considered a match; scores in between are shown as a possible match. This setting can be adjusted at any time (see screenshot below). Lowering the acceptance threshold reduces the likelihood of a false rejection but increases the risk of a false acceptance.
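The thresholding described above can be pictured with a minimal Python sketch. It is an illustration only, not the application's implementation: the function name `classify_score`, the constant names, and the color lookup are hypothetical, and treating a score exactly equal to a threshold as a possible match is an assumption, since the text only specifies "below -1" and "above 3".

```python
from typing import Optional

# Default thresholds taken from the text above.
DEFAULT_REJECT_BELOW = -1.0  # scores below this are a non-match
DEFAULT_ACCEPT_ABOVE = 3.0   # scores above this are a match

# Colors as described in the "Similarity scores" section.
COLORS = {
    "match": "green",
    "possible match": "yellow",
    "non-match": "red",
    "no speech": "white",
}


def classify_score(score: Optional[float],
                   reject_below: float = DEFAULT_REJECT_BELOW,
                   accept_above: float = DEFAULT_ACCEPT_ABOVE) -> str:
    """Map a similarity score to a result label (hypothetical helper).

    None stands for a recording in which no speech was detected.
    Scores exactly on a threshold are treated as a possible match,
    which is an assumption of this sketch.
    """
    if score is None:
        return "no speech"
    if score > accept_above:
        return "match"
    if score < reject_below:
        return "non-match"
    return "possible match"


# With the defaults, a score of 2.5 is only a possible match.
label_default = classify_score(2.5)
# Lowering the acceptance threshold to 2.0 turns the same score into a match,
# which is how a lower threshold trades false rejections for false acceptances.
label_lowered = classify_score(2.5, accept_above=2.0)
```

In this sketch, `COLORS[classify_score(score)]` gives the color a result would be displayed in.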

Export formats

Once the results are available, the user can export them either as simple comparisons or as comprehensive comparison matrices. Both export options are available in .xlsx and .csv formats, ensuring compatibility with various analytical tools and platforms.

Simple comparisons: table containing each recording A and B and their similarity scores.

Comparison matrix: table showing a comparison matrix of known and unknown speakers and their voice similarity scores.
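To make the difference between the two export layouts concrete, here is a minimal Python sketch that writes the same scores as a list of simple comparisons and as a comparison matrix in CSV form. The column names, file names, and scores are made up for illustration and are not the exact layout of the application's export files.

```python
import csv
import io


def simple_comparisons_csv(scores):
    """One row per pair: recording A, recording B, similarity score."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["recording_a", "recording_b", "score"])  # hypothetical header
    for (known, unknown), score in scores.items():
        writer.writerow([known, unknown, score])
    return buf.getvalue()


def comparison_matrix_csv(scores):
    """Known recordings as rows, unknown recordings as columns."""
    knowns = sorted({known for known, _ in scores})
    unknowns = sorted({unknown for _, unknown in scores})
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow([""] + unknowns)  # top-left corner cell is left empty
    for known in knowns:
        # Pairs without a score are written as empty cells.
        writer.writerow([known] + [scores.get((known, u), "") for u in unknowns])
    return buf.getvalue()


# Made-up example: one known recording compared with two unknown recordings.
example_scores = {
    ("known_1.wav", "unknown_1.wav"): 4.5,
    ("known_1.wav", "unknown_2.wav"): -2.0,
}
simple_text = simple_comparisons_csv(example_scores)
matrix_text = comparison_matrix_csv(example_scores)
```

The simple layout is easier to filter and sort pair by pair, while the matrix gives an overview of all known speakers against all unknown recordings at once.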
