Version: 5.0.0

Audio Quality Estimation

This guide demonstrates how to perform Audio Quality Estimation with Phonexia Speech Platform 4. You can find a high-level description in the Audio Quality Estimation article.

For testing, we'll be using two media files. You can download them together in the audio_files.zip archive.

At the end of this guide, you'll find the full Python code example that combines all the steps, which are first discussed separately. This guide should give you a comprehensive understanding of how to integrate Audio Quality Estimation into your own projects.

Prerequisites

Follow the prerequisites for setting up the Virtual Appliance and the Python environment as described in the Task lifecycle code examples.

Run Audio Quality Estimation

To run Audio Quality Estimation for a single media file, start by sending a POST request to the /api/technology/audio-quality-estimation endpoint. The only mandatory parameter is file. In Python, you can do this as follows:

import requests

VIRTUAL_APPLIANCE_ADDRESS = "http://<virtual-appliance-address>:8000"  # Replace with your address
MEDIA_FILE_BASED_ENDPOINT_URL = f"{VIRTUAL_APPLIANCE_ADDRESS}/api/technology/audio-quality-estimation"

media_file = "Kathryn_Paula.wav"

with open(media_file, mode="rb") as file:
    files = {"file": file}
    start_task_response = requests.post(
        url=MEDIA_FILE_BASED_ENDPOINT_URL,
        files=files,
    )
print(start_task_response.status_code)  # Should print '202'

If the task has been successfully accepted, the 202 code will be returned together with a unique task ID in the response body. The task isn't processed immediately, but only scheduled for processing. You can check the current task status by polling for the result.
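As a minimal sketch of handling the accepted response, the helper below pulls out the task ID and the polling URL. The name extract_task_info is hypothetical. It assumes that the 202 response body carries the same "task" object that appears in the final result, and that the polling URL arrives in the Location header, as used in the full example in this guide. The values passed in at the bottom are stand-ins, not real server output.

```python
def extract_task_info(status_code, headers, body):
    """Return (task_id, polling_url) for an accepted task, otherwise raise."""
    if status_code != 202:
        raise RuntimeError(f"Task was not accepted (HTTP {status_code})")
    # Assumption: the body has the shape {"task": {"task_id": ..., "state": ...}}.
    task_id = body.get("task", {}).get("task_id")
    # The Location header is what the full example in this guide polls.
    polling_url = headers.get("Location")
    return task_id, polling_url


# Stand-in values for illustration only. With requests you would pass
# start_task_response.status_code, start_task_response.headers and
# start_task_response.json() instead.
task_id, polling_url = extract_task_info(
    202,
    {"Location": "http://example.invalid/placeholder-polling-url"},
    {"task": {"task_id": "2c031e72-374e-4c7b-9315-5ca05404dd89", "state": "pending"}},
)
print(task_id)
print(polling_url)
```

Keeping the task ID around is useful for logging, while the polling URL is what the next step queries.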

Polling

To obtain the final result, periodically query the task status until the task state changes to done, failed or rejected. The general polling procedure is described in detail in the Task lifecycle code examples.
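The polling loop can be sketched with an upper time limit so that a stuck task does not block the caller forever. This is an illustrative variant, not the procedure from the Task lifecycle code examples: wait_for_final_state, its timeout parameter, and the injected get_state callable are hypothetical, and the intermediate state names "pending" and "running" in the demo are assumptions. Only done, failed, and rejected come from this guide.

```python
import time

FINAL_STATES = {"done", "failed", "rejected"}


def wait_for_final_state(get_state, polling_interval=5.0, timeout=600.0):
    """Call get_state() repeatedly until it returns a final state or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state in FINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Task still in state '{state}' after {timeout} s")
        time.sleep(polling_interval)


# Demo with a fake state source. In real use, get_state would send a GET
# request to the polling URL and return response.json()["task"]["state"].
fake_states = iter(["pending", "running", "done"])
final_state = wait_for_final_state(lambda: next(fake_states), polling_interval=0)
print(final_state)
```

Passing the state lookup in as a callable keeps the loop independent of the HTTP client, which also makes it easy to try out without a running Virtual Appliance.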

Result for Audio Quality Estimation

The result field of the task contains estimated audio quality for each channel in the channels list. The length of the list corresponds to the number of channels in the input media file.

For our sample data, the task result should look as follows:

{
  "task": {
    "task_id": "2c031e72-374e-4c7b-9315-5ca05404dd89",
    "state": "done"
  },
  "result": {
    "channels": [
      {
        "channel_number": 0,
        "pesq_estimation": 3.3294897079467773,
        "signal_noise_ratio": 100.0,
        "audio_length": 121.92500305175781,
        "max_amplitude": 0.5001373291015625,
        "min_amplitude": -0.42408519983291626,
        "peak_amplitude": 0.5001373291015625,
        "mean_amplitude": 0.00011758386972360313,
        "sampling_rate": 8000
      }
    ]
  }
}
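To illustrate how the result might be consumed, the sketch below checks the task state and builds a one-line summary per channel. The function name summarize_channels and the output format are hypothetical. The field names are taken from the sample result above, which is also used as the input here, with some fields left out for brevity.

```python
def summarize_channels(task_response):
    """Return one summary string per channel of a finished task."""
    state = task_response["task"]["state"]
    if state != "done":
        raise ValueError(f"No result available, task state is '{state}'")
    summaries = []
    for channel in task_response["result"]["channels"]:
        summaries.append(
            "channel {}: PESQ estimate {:.2f}, SNR {:.1f}, length {:.1f}, sampling rate {}".format(
                channel["channel_number"],
                channel["pesq_estimation"],
                channel["signal_noise_ratio"],
                channel["audio_length"],
                channel["sampling_rate"],
            )
        )
    return summaries


# Subset of the sample result shown above.
sample_response = {
    "task": {"task_id": "2c031e72-374e-4c7b-9315-5ca05404dd89", "state": "done"},
    "result": {
        "channels": [
            {
                "channel_number": 0,
                "pesq_estimation": 3.3294897079467773,
                "signal_noise_ratio": 100.0,
                "audio_length": 121.92500305175781,
                "sampling_rate": 8000,
            }
        ]
    },
}

for line in summarize_channels(sample_response):
    print(line)
```

Checking the state first matters because a failed or rejected task is also a final state, and in that case there may be no channels list to read.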

Full Python code

Here is the full example of how to run the Audio Quality Estimation technology. The code is slightly adjusted and wrapped into functions for better readability. Refer to the Task lifecycle code examples for a generic code template that is applicable to all technologies.

import json
import requests
import time

VIRTUAL_APPLIANCE_ADDRESS = "http://<virtual-appliance-address>:8000"  # Replace with your address

MEDIA_FILE_BASED_ENDPOINT_URL = f"{VIRTUAL_APPLIANCE_ADDRESS}/api/technology/audio-quality-estimation"


def poll_result(polling_url, polling_interval=5):
    """Poll the task endpoint until processing completes."""
    while True:
        polling_task_response = requests.get(polling_url)
        polling_task_response.raise_for_status()
        polling_task_response_json = polling_task_response.json()
        task_state = polling_task_response_json["task"]["state"]
        if task_state in {"done", "failed", "rejected"}:
            break
        time.sleep(polling_interval)
    return polling_task_response


def run_media_based_task(media_file, params=None, config=None):
    """Create a media-based task and wait for results."""
    if params is None:
        params = {}
    if config is None:
        config = {}

    with open(media_file, mode="rb") as file:
        files = {"file": file}
        start_task_response = requests.post(
            url=MEDIA_FILE_BASED_ENDPOINT_URL,
            files=files,
            params=params,
            data={"config": json.dumps(config)},
        )
        start_task_response.raise_for_status()
    polling_url = start_task_response.headers["Location"]
    task_result = poll_result(polling_url)
    return task_result.json()


# Run Audio Quality Estimation
media_files = ["Laura_Harry_Veronika.wav", "Kathryn_Paula.wav"]

for media_file in media_files:
    print(f"Running Audio Quality Estimation for file {media_file}.")
    media_file_based_task = run_media_based_task(media_file)
    media_file_based_task_result = media_file_based_task["result"]
    print(json.dumps(media_file_based_task_result, indent=2))
