Version: 3.2.0

Language Identification

Language Identification detects the language spoken in an audio recording. It can identify 140 languages from around the world and, for widely spoken languages, can also distinguish between regional varieties.

Supported languages

See the complete list of languages supported by Language Identification.

How it works

Creating a subset of languages

The first step is optional. Clicking "Select a subset" opens a new window where you can choose the languages or regions likely to be present in the recording. Excluding unlikely languages can improve the accuracy of the language identification results.

If you frequently work with a specific region and expect only a particular subset of languages to appear in your recordings, consider saving your language subset for repeated use. You can create and save as many custom subsets as needed.

warning

Your language subsets are stored locally in your browser. If you switch devices or use a different browser, these subsets will no longer be available.
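
To illustrate why excluding unlikely languages can help, here is a minimal Python sketch. It assumes the per-language scores behave like probabilities that can be renormalized over the chosen subset. This is an illustration only, not a description of how the product computes its results. The function name `restrict_to_subset` and the example scores are hypothetical.

```python
def restrict_to_subset(scores, subset):
    """Keep only the languages in `subset` and rescale their scores to sum to 1.

    Assumption: `scores` maps language codes to non-negative,
    probability-like values. The real scoring method may differ.
    """
    kept = {lang: score for lang, score in scores.items() if lang in subset}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("No language in the subset has a non-zero score.")
    return {lang: score / total for lang, score in kept.items()}


# Hypothetical scores for one recording; Czech and Slovak are easily confused.
scores = {"cs": 0.40, "sk": 0.35, "pl": 0.15, "ru": 0.10}

# If Slovak and Russian are known not to occur, exclude them.
restricted = restrict_to_subset(scores, {"cs", "pl"})
```

With the confusable language removed, its share of the score is redistributed among the remaining candidates, so the gap between the top result and the alternatives becomes wider.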

Uploading files

Afterwards, you can upload your files or create your own recordings by using the built-in recording feature. If you don't have your own files, you can use the provided Phonexia examples to explore how Language Identification works.

Read more about uploading files here.

Results

After uploading, your recordings will be listed in the left panel. Once processing is complete, the results for each recording will be displayed as a bar chart in the right panel. Only the languages detected with a significant score will be shown.
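
The idea of showing only languages with a significant score can be sketched as a simple threshold filter. The threshold value `0.05`, the function name `significant_results`, and the example scores are assumptions made for illustration; the cutoff the application actually uses is not documented here.

```python
def significant_results(scores, threshold=0.05):
    """Return (language, score) pairs at or above `threshold`, highest first.

    Assumption: the threshold of 0.05 is hypothetical, chosen only
    to demonstrate the filtering step.
    """
    shown = [(lang, score) for lang, score in scores.items() if score >= threshold]
    return sorted(shown, key=lambda item: item[1], reverse=True)


# Hypothetical scores for one recording.
bars = significant_results({"en": 0.90, "de": 0.08, "fr": 0.02})
```

Here `bars` holds the entries that would be drawn in the chart, ordered from the highest score to the lowest; the low-scoring language is left out.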

tip

If, after listening to a recording, you find that the identified language is incorrect, you can refine the results with a subset. Click the button next to the filename, then create a new subset or modify the current one so that it excludes the incorrect language. Processing the recording against the adjusted subset can give more accurate results.