Authenticity Verification
Please note that Authenticity Verification is an experimental feature.
It is under development and may change in the future.
This guide demonstrates how to perform Authenticity Verification with Phonexia Speech Platform 4.
Authenticity Verification technology enables you to determine if audio data is genuine. While the technology is designed to eventually offer various methods for verifying audio authenticity, the only currently available operation is Deepfake Detection, which is enabled by default.
In the guide, we'll be using the following audio files. You can download them all together in the audio_files.zip archive.
| filename | deepfake |
|---|---|
| Bridget.wav | no |
| Graham.wav | yes |
At the end of this guide, you'll find the full Python code example combining all the steps, which are first discussed separately. This guide should give you a comprehensive understanding of how to integrate Authenticity Verification into your own projects.
Prerequisites
In the guide, we assume that the Virtual Appliance is running on port 8000 of http://localhost. For more information on how to install and start the Virtual Appliance, please refer to the Virtual Appliance Installation guide.
The technology requires a proper model and license in order to process any files. For more details on models and licenses, see the Licensing section.
Environment Setup
We are using Python 3.9 and the Python library requests 2.27 in this example. You can install the requests library with pip as follows:
pip install requests~=2.27
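Before moving on, you may want to confirm that the Virtual Appliance from the Prerequisites section is reachable. The snippet below is only a minimal sketch: it assumes nothing about the API beyond the server address and simply checks that http://localhost:8000 answers HTTP requests at all.

```python
import requests

SPEECH_PLATFORM_SERVER = "http://localhost:8000"  # Replace with your actual server URL

try:
    # Any HTTP response (even an error status code) means the server is reachable.
    response = requests.get(SPEECH_PLATFORM_SERVER, timeout=5)
    print(f"Server reachable, responded with HTTP {response.status_code}.")
except requests.exceptions.ConnectionError:
    print("Could not connect to the server. Is the Virtual Appliance running?")
```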
Basic Authenticity Verification – Deepfake Detection
Note that all example results for Deepfake Detection are acquired with the beta:2.0.0 model. The results may change in future releases.
To run Deepfake Detection for a single media file, you should start by sending a POST request to the /api/technology/experimental/authenticity-verification endpoint.
In Python, you can do this as follows:
import requests

SPEECH_PLATFORM_SERVER = "http://localhost:8000"  # Replace with your actual server URL
ENDPOINT_URL = f"{SPEECH_PLATFORM_SERVER}/api/technology/experimental/authenticity-verification"

audio_path = "Bridget.wav"

with open(audio_path, mode="rb") as file:
    files = {"file": file}
    response = requests.post(
        url=ENDPOINT_URL,
        files=files,
    )

print(response.status_code)  # Should print '202'
If the task has been successfully accepted, the 202 code will be returned together with a unique task ID in the response body. The task isn't processed immediately, but only scheduled for processing. You can check the current task status by polling for the result.
The URL for polling the result is returned in the X-Location header. Alternatively, you can assemble the polling URL on your own by appending a slash (/) and the task ID to the endpoint URL.
import json
import requests
import time

# Use the `response` from the previous step
polling_url = response.headers["x-location"]
# Alternatively:
# polling_url = ENDPOINT_URL + "/" + response.json()["task"]["task_id"]

while True:
    response = requests.get(polling_url)
    data = response.json()
    task_status = data["task"]["state"]
    if task_status in {"done", "failed", "rejected"}:
        break
    time.sleep(5)

print(json.dumps(data, indent=2))
Once the polling finishes, data will contain the latest response from the server – either the result of Authenticity Verification, or an error message with details, in case processing was not able to finish properly.
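Before reading the result, it's a good idea to verify that the task finished in the done state. The exact format of the error details isn't covered in this guide, so this minimal sketch simply raises an error containing the whole response body when the task did not succeed:

```python
# Minimal sketch: guard against "failed" or "rejected" tasks before accessing the result.
# `data` is the dictionary obtained from the polling loop above.
if data["task"]["state"] != "done":
    raise RuntimeError(f"Task did not finish successfully:\n{json.dumps(data, indent=2)}")

result = data["result"]
```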
For deepfake_detection, the resulting score has a decision threshold of 0. Deepfake files should have a score higher than 0, while genuine files should have a score lower than 0. The technology result for deepfake_detection can be accessed as data["result"]["deepfake_detection"], and for our sample audio, data should look as follows:
{
  "task": {
    "task_id": "330f9d36-04e2-4b78-b4da-79bdd61aa7db",
    "state": "done"
  },
  "result": {
    "deepfake_detection": {
      "channels": [
        {
          "channel_number": 0,
          "score": -1.434390902519226
        }
      ]
    }
  }
}
When processing multichannel media files, you will receive an independent Deepfake Detection operation result for each channel within the channels list.
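If you want to turn the raw scores into a simple genuine/deepfake verdict, you can apply the decision threshold of 0 described above to each channel. This is just an illustrative sketch working with the data dictionary from the polling step:

```python
# Minimal sketch: label each channel using the decision threshold of 0
# (scores above 0 suggest a deepfake, scores below 0 a genuine recording).
for channel in data["result"]["deepfake_detection"]["channels"]:
    verdict = "deepfake" if channel["score"] > 0 else "genuine"
    print(f"Channel {channel['channel_number']}: score {channel['score']:.3f} -> {verdict}")
```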
Full Python code
Here is the full example of how to run the Deepfake Detection operation with Authenticity Verification technology. The code is slightly adjusted and wrapped into functions for better readability.
import json
import requests
import time

SPEECH_PLATFORM_SERVER = "http://localhost:8000"  # Replace with your actual server URL
ENDPOINT_URL = f"{SPEECH_PLATFORM_SERVER}/api/technology/experimental/authenticity-verification"


def poll_result(polling_url: str, sleep: int = 5):
    while True:
        response = requests.get(polling_url)
        response.raise_for_status()
        data = response.json()
        task_status = data["task"]["state"]
        if task_status in {"done", "failed", "rejected"}:
            break
        time.sleep(sleep)
    return response


def run_authenticity_verification(audio_path: str):
    with open(audio_path, mode="rb") as file:
        files = {"file": file}
        response = requests.post(
            url=ENDPOINT_URL,
            files=files,
        )
    response.raise_for_status()
    polling_url = response.headers["x-location"]
    response_result = poll_result(polling_url)
    return response_result.json()


filenames = ["Bridget.wav", "Graham.wav"]
for filename in filenames:
    print(f"Running Authenticity Verification for file {filename}.")
    data = run_authenticity_verification(filename)
    result = data["result"]
    print(json.dumps(result, indent=2))