Deepfake Detection
Phonexia has developed a Deepfake Detection technology designed to identify artificial voices within audio recordings, thereby enhancing the security and reliability of speaker verification systems. This approach leverages a transformer-based architecture and is primarily trained on datasets that encompass a wide range of synthesized, converted, and replayed speech examples.
The model is trained in a self-supervised manner, and its performance is improved through carefully designed data augmentation techniques. It has been trained on a large corpus drawn from various data sources, including telephone data, resulting in fewer false alarms on such recordings. The model requires a minimum of 3 seconds of speech for inference; however, the best performance is achieved on recordings containing 5 seconds of speech or more.
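The speech-length guidance above can be turned into a simple pre-check before a recording is submitted. The following is a minimal sketch, not part of the product: the function name `speech_length_status` and its return labels are hypothetical, and it assumes you already know how many seconds of speech the recording contains (for example, from a voice activity detector).

```python
# Thresholds taken from the text above; the names are illustrative only.
MIN_SPEECH_SECONDS = 3.0          # minimum speech required for inference
RECOMMENDED_SPEECH_SECONDS = 5.0  # best performance at this length and above


def speech_length_status(speech_seconds: float) -> str:
    """Classify a recording by the amount of speech it contains.

    Returns "too_short" below the 3-second minimum, "accepted" between
    3 and 5 seconds, and "recommended" at 5 seconds or more.
    """
    if speech_seconds < 0:
        raise ValueError("speech_seconds must be non-negative")
    if speech_seconds < MIN_SPEECH_SECONDS:
        return "too_short"
    if speech_seconds < RECOMMENDED_SPEECH_SECONDS:
        return "accepted"
    return "recommended"
```

Note that the limits refer to speech, not to total file duration, so a long recording with mostly silence may still fall below the minimum.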
Possible use cases
- Banks and Call Centers: Enhances the security of customer interactions by ensuring that communications are with legitimate individuals, thereby preventing fraudulent activities and unauthorized access.
- Forensic Analysis: Assists law enforcement agencies in authenticating audio evidence, ensuring its credibility in investigations and legal proceedings.
Scoring
The score is a log-likelihood ratio (LLR). A value greater than 0 indicates the audio is likely a deepfake, while a value less than 0 suggests the audio is likely genuine.
It is calibrated so that 0 corresponds to the Equal Error Rate (EER) point on our evaluation datasets. The EER is the point where the false acceptance rate and false rejection rate are equal, providing a balanced trade-off between the two types of errors.
Depending on the characteristics of your data and specific use case, you may need to adjust the decision threshold to achieve the desired ratio between false positives and false negatives.
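The thresholding described above can be sketched as follows. This is an illustrative example under stated assumptions, not Phonexia's implementation: the function names are hypothetical, a score exactly equal to the threshold is treated as genuine (the text does not specify this case), and the threshold is tuned on a set of LLR scores from recordings you already know to be genuine.

```python
import math
from typing import Sequence


def classify(llr: float, threshold: float = 0.0) -> str:
    """Label a recording from its LLR score.

    Scores above the threshold are flagged as deepfake. Treating a score
    equal to the threshold as genuine is an assumption of this sketch.
    """
    return "deepfake" if llr > threshold else "genuine"


def threshold_for_false_alarm_rate(
    genuine_llrs: Sequence[float], max_rate: float
) -> float:
    """Pick the lowest observed threshold that keeps false alarms in budget.

    genuine_llrs: LLR scores of recordings known to be genuine.
    max_rate: largest acceptable fraction of genuine recordings flagged
        as deepfake (a value between 0 and 1).
    """
    if not genuine_llrs:
        raise ValueError("at least one genuine score is required")
    if not 0.0 <= max_rate <= 1.0:
        raise ValueError("max_rate must be between 0 and 1")

    ranked = sorted(genuine_llrs, reverse=True)
    # Number of genuine recordings we can afford to flag by mistake.
    allowed = int(max_rate * len(ranked))
    if allowed >= len(ranked):
        return -math.inf  # every recording may be flagged
    # At most `allowed` scores are strictly greater than ranked[allowed].
    return ranked[allowed]
```

Raising the threshold above 0 reduces false alarms on genuine recordings at the cost of missing more deepfakes; lowering it has the opposite effect. The chosen value should be validated on data representative of your own use case.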
Output Range
- In the graphical user interface (GUI), the LLR output is displayed within a range of -2 to +6, reflecting the most common interval of values observed.
- In the REST API, the score is returned as an unbounded LLR, theoretically ranging from minus infinity to plus infinity. However, in practice, values typically fall within the range of -2 to +6.
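If you consume the unbounded REST API score but want to present it on the same scale as the GUI, one option is to clip it to the displayed interval. The snippet below is a sketch under that assumption; whether the GUI itself clips values in this way is not stated here, and the names are hypothetical.

```python
# Display interval taken from the text above; names are illustrative only.
GUI_MIN_LLR = -2.0
GUI_MAX_LLR = 6.0


def clamp_for_display(llr: float) -> float:
    """Clip an unbounded LLR to the -2 to +6 interval used for display."""
    return max(GUI_MIN_LLR, min(GUI_MAX_LLR, llr))
```

Clipping is suitable for visualization only; decisions should be made on the original, unclipped score so that a custom threshold outside the displayed interval still behaves as expected.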
FAQ
How can I improve processing speed?
To speed up processing, ensure that Authenticity Verification is running on a GPU.