Audio Quality Estimation
The Audio Quality Estimation technology analyzes the perceptual clarity of a recording and reports it as a single metric, the PESQ score estimation.
How it works
Uploading files
Upload your files or create your own recordings by using the built-in recording feature. If you don't have your own files, you can use the provided Phonexia examples to explore how Audio Quality Estimation works.
Read more about uploading files here.
Results
After you upload your recordings, they appear in the left panel. Once processing is complete, the PESQ score estimation is displayed as a quick result on the right side of the left panel. The right panel provides a more detailed visualization of the result, showing the score as a position on a bar range.
What is the PESQ score?
Perceptual Evaluation of Speech Quality (PESQ) is a standardized method for assessing audio quality by comparing the tested audio to a reference. It evaluates factors like sharpness, background noise, and clipping, providing a score from -0.5 to 4.5, where higher scores indicate better quality.
The Phonexia Audio Quality Estimation technology does not provide a strict PESQ score but instead generates a PESQ score estimation using a machine learning model. This approach eliminates the need for reference audio, offering a more flexible and efficient method for evaluating audio quality.
Export formats
Once your results are ready, you can export them in various formats.
Audio Quality Estimation results can be exported individually for each file in CSV, XLSX, or JSON format. Each export file is named after the corresponding audio file and includes the channel number as well as the PESQ score estimation.
The same results can also be exported in bulk as a ZIP file. Additionally, users have the option to export a summary file that displays the scores for all the selected recordings.
XLSX format
CSV format
Channel,PESQ Score
0,3.27
1,3.51
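The per-file CSV export can be read with any standard CSV parser. The following is a minimal Python sketch, assuming the export has exactly the two columns shown above; the function name read_channel_scores is hypothetical and not part of any Phonexia tooling.

```python
import csv
import io

# Sample per-file export, copied from the CSV example above.
SAMPLE_CSV = """Channel,PESQ Score
0,3.27
1,3.51
"""


def read_channel_scores(csv_text: str) -> dict[int, float]:
    """Map each channel number to its PESQ score estimation.

    Assumes the header row is exactly "Channel,PESQ Score".
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {int(row["Channel"]): float(row["PESQ Score"]) for row in reader}


scores = read_channel_scores(SAMPLE_CSV)
for channel, score in scores.items():
    print(f"channel {channel}: {score:.2f}")
```

To read a real export, pass the file contents instead of SAMPLE_CSV, for example the text returned by open(path, newline="").read().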
JSON format
In addition to the PESQ score and channel number, the JSON format also includes detailed information about the recording, such as its length, sampling rate, and other relevant characteristics.
{
  "channels": [
    {
      "channel_number": 0,
      "pesq_estimation": 3.2696213722229004,
      "signal_noise_ratio": 100,
      "audio_length": 121.92500305175781,
      "max_amplitude": 1,
      "min_amplitude": -0.8483535051345825,
      "peak_amplitude": 1,
      "mean_amplitude": 0.00009235431207343936,
      "sampling_rate": 8000
    },
    {
      "channel_number": 1,
      "pesq_estimation": 3.5104048252105713,
      "signal_noise_ratio": 100,
      "audio_length": 121.92500305175781,
      "max_amplitude": 0.364848792552948,
      "min_amplitude": -0.5544297695159912,
      "peak_amplitude": 0.5544297695159912,
      "mean_amplitude": 0.00014280321192927659,
      "sampling_rate": 8000
    }
  ]
}
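The JSON export can be processed with a standard JSON parser. The sketch below is a minimal Python example that uses only field names visible in the sample above; the sample is shortened to three fields per channel, and the function name summarize_channels is hypothetical. In this sample, the scores shown in the CSV export appear to be the JSON values rounded to two decimal places, so the sketch rounds the same way.

```python
import json

# Shortened copy of the JSON example above (three fields per channel).
SAMPLE_JSON = """{
  "channels": [
    {"channel_number": 0, "pesq_estimation": 3.2696213722229004, "sampling_rate": 8000},
    {"channel_number": 1, "pesq_estimation": 3.5104048252105713, "sampling_rate": 8000}
  ]
}"""


def summarize_channels(json_text: str) -> list[tuple[int, float, int]]:
    """Return (channel number, rounded PESQ estimation, sampling rate) per channel."""
    data = json.loads(json_text)
    return [
        (ch["channel_number"], round(ch["pesq_estimation"], 2), ch["sampling_rate"])
        for ch in data["channels"]
    ]


summary = summarize_channels(SAMPLE_JSON)
for channel, score, rate in summary:
    print(f"channel {channel}: PESQ estimation {score} at {rate} Hz")
```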
All results
Whether in CSV or XLSX format, the summary export lists the PESQ score estimation for every channel of each selected recording.
Filename,Channel,PESQ Score
Barbara.mp3,0,3.22
Friedrich.wav,0,3.00
Harry.wav,0,1.69
Kathryn_Paula.wav,0,3.27
Kathryn_Paula.wav,1,3.51
Luka.wav,0,4.23
Veronika_Harry.wav,0,3.11
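A summary file like this lends itself to simple screening, for example listing the channels that score below a chosen cutoff. The following Python sketch assumes the three-column layout shown above. The function name find_low_scores and the cutoff of 3.0 are illustrative assumptions, not a threshold recommended by Phonexia.

```python
import csv
import io

# Summary export, copied from the example above.
SUMMARY_CSV = """Filename,Channel,PESQ Score
Barbara.mp3,0,3.22
Friedrich.wav,0,3.00
Harry.wav,0,1.69
Kathryn_Paula.wav,0,3.27
Kathryn_Paula.wav,1,3.51
Luka.wav,0,4.23
Veronika_Harry.wav,0,3.11
"""


def find_low_scores(csv_text: str, threshold: float) -> list[tuple[str, int, float]]:
    """Return (filename, channel, score) for rows scoring strictly below threshold."""
    reader = csv.DictReader(io.StringIO(csv_text))
    flagged = []
    for row in reader:
        score = float(row["PESQ Score"])
        if score < threshold:
            flagged.append((row["Filename"], int(row["Channel"]), score))
    return flagged


# The cutoff is an arbitrary value chosen for illustration.
flagged = find_low_scores(SUMMARY_CSV, threshold=3.0)
for filename, channel, score in flagged:
    print(f"{filename} (channel {channel}): {score:.2f}")
```

Because the comparison is strict, a recording that scores exactly at the cutoff, such as Friedrich.wav here, is not flagged.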