Version: 4.0.0-rc1

Emotion Recognition

Phonexia Emotion Recognition determines which emotion is expressed in a recording, that is, whether the speaker sounds happy, neutral, sad, or angry, regardless of the language spoken.

This page explains how to use Phonexia Emotion Recognition in our web application. If you want to dive deeper into the inner workings of this technology, check out our detailed technical documentation.

Uploading files

Upload your files or create your own recordings by using the built-in recording feature. If you don't have your own files, you can use the provided Phonexia examples to explore how the technology works.

Read more about uploading files here.

Results

After uploading, your recordings will appear in the left panel. Once processing is complete, the right panel will display the results for each recording as a radial bar chart, showing the occurrence of each of the four emotions (happy, neutral, sad, and angry) as a percentage. If the audio is stereo, results are shown separately for each channel.

Export formats

Once your results are ready, you can export them in various formats.

Emotion Recognition results can be exported individually for each file in CSV, XLSX, or JSON format. Each export file is named after the corresponding recording and includes the channel number along with the score for each emotion.

The same results can also be exported in bulk as a ZIP file. Additionally, you can export a summary file that contains the scores for all selected recordings.

XLSX

Table showing channel information and the respective scores for each emotion.

CSV

Channel,Happy (%),Neutral (%),Sad (%),Angry (%)
0,79.87,19.02,0.56,0.55
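
As a rough illustration of working with this export outside the web application, the following Python sketch parses the sample rows above and reports the highest-scoring emotion per channel. The function name `read_emotion_csv` is hypothetical, and the sketch assumes every export uses the same header row as the sample.

```python
import csv
import io

# Sample export text copied from the CSV example above.
CSV_TEXT = """Channel,Happy (%),Neutral (%),Sad (%),Angry (%)
0,79.87,19.02,0.56,0.55
"""


def read_emotion_csv(text):
    """Return one dict per channel: {"channel": int, "scores": {emotion: percent}}.

    Hypothetical helper; assumes the header layout shown in the sample export.
    """
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        channel = int(row.pop("Channel"))
        # Strip the " (%)" suffix from each header, e.g. "Happy (%)" -> "happy".
        scores = {
            name.replace(" (%)", "").lower(): float(value)
            for name, value in row.items()
        }
        rows.append({"channel": channel, "scores": scores})
    return rows


rows = read_emotion_csv(CSV_TEXT)
for r in rows:
    top = max(r["scores"], key=r["scores"].get)
    print(f"channel {r['channel']}: {top} ({r['scores'][top]:.2f}%)")
# prints: channel 0: happy (79.87%)
```

For a stereo recording, the same loop would print one line per channel.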

JSON

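The JSON export includes channel information, speech length, and a score for each emotion. Unlike the CSV export, the sample below reports each score as a probability between 0 and 1 rather than as a percentage.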

{
  "channels": [
    {
      "channel_number": 0,
      "speech_length": 43.59,
      "scores": [
        {
          "emotion": "happy",
          "probability": 0.7987
        },
        {
          "emotion": "neutral",
          "probability": 0.19022
        },
        {
          "emotion": "sad",
          "probability": 0.00562
        },
        {
          "emotion": "angry",
          "probability": 0.00545
        }
      ]
    }
  ]
}
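
The structure above can be read with any JSON parser. The Python sketch below loads the sample, converts each probability to a percentage, and picks the dominant emotion per channel. The function name `summarize` is hypothetical. Based on the sample values, the percentages in the CSV export appear to be these probabilities multiplied by 100, but treat that relationship as an assumption.

```python
import json

# Sample export text copied from the JSON example above.
JSON_TEXT = """
{
  "channels": [
    {
      "channel_number": 0,
      "speech_length": 43.59,
      "scores": [
        {"emotion": "happy", "probability": 0.7987},
        {"emotion": "neutral", "probability": 0.19022},
        {"emotion": "sad", "probability": 0.00562},
        {"emotion": "angry", "probability": 0.00545}
      ]
    }
  ]
}
"""


def summarize(result):
    """Return a list of (channel_number, dominant_emotion, percents) tuples.

    Hypothetical helper; assumes the field names shown in the sample export.
    """
    summary = []
    for channel in result["channels"]:
        # Convert probabilities (0..1) to percentages (0..100).
        percents = {
            score["emotion"]: score["probability"] * 100
            for score in channel["scores"]
        }
        dominant = max(percents, key=percents.get)
        summary.append((channel["channel_number"], dominant, percents))
    return summary


summary = summarize(json.loads(JSON_TEXT))
for number, dominant, percents in summary:
    print(f"channel {number}: {dominant} ({percents[dominant]:.2f}%)")
# prints: channel 0: happy (79.87%)
```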

All results

The summary file can be exported in CSV or XLSX format. In both formats, it lists the percentage of each emotion for every selected recording.

Table showing filename, channel, and the respective scores for each emotion.