Skip to main content
Version: 3.1.0

Overview

Integrated as part of the virtual appliance, Phonexia offers a distinctive web application designed for demonstrating the capabilities of Speech Platform 4 and catering to users who favor a simple, straightforward approach when working with small amounts of audio data.

Its clear and intuitive design guides users seamlessly through the speech technology features without confusion. It is responsive and adaptable to different screen sizes including mobile devices to maintain usability across various platforms. Furthermore, at Phonexia we conduct continuous usability testing with target users to gather feedback and insights on the effectiveness and user-friendliness of the web application design, iterating and refining the interface based on real-world usage scenarios of our technologies.

Current Technologies

At the moment, Phonexia Speech Platform 4 offers two powerful technologies that can be used in the web application:

Speaker Identification

Phonexia Speaker Identification uses deep neural networks to create highly accurate mathematical models of the human voice (voiceprints) and provides rapid, highly accurate voice comparison for any scenario, from individual 1:1 voice verification to complex 1:N and N:M speaker identification.

The comparison of two recordings (or groups of recordings) occurs within two adjacent panes, where uploaded and processed audios are juxtaposed for comparison. Audios displaying a high similarity score (number in green) indicate a match with the voice in the selected recording in the neighboring pane.

You can learn more about Speaker Identification in our Technologies section.

Speaker identification

Speech to Text

Phonexia Speech to Text technology offers cutting-edge deep neural network models with large open-source models, providing an extensive transcription range of over 60 languages as well as automatic language detection. It incorporates state-ot-the-art channel compensation techniques, ensuring compatibility with a broad spectrum of audio sources, including GSM/CDMA, 3G, VoIP, landlines, and satellite phones.

In our web application, users are empowered to effortlessly upload audio files featuring speech, with the option to specify their desired language or rely on automatic detection, streamlining the transcription process.

You can learn more about Phonexia 6th Gen Speech to Text and Enhanced Speech to Text Built on Whisper in our Technologies section.

Speech to Text

What's Next to Come?

At Phonexia we are continuously enhancing our Speech Platform 4 portfolio by integrating the latest innovations. Users can expect the introduction of the following technologies:

  • Language Identification
  • Speaker Diarization
  • Keyword Spotting
  • Gender Identification
  • Age Estimation
  • Speech Quality Estimation