Version: 2026.03.0-rc1

Supported Audio Formats

This page describes the audio file formats and encodings supported by the microservices.

Quick Reference

| File Format | Extensions | Containers | Common Encodings |
|---|---|---|---|
| WAV | .wav | RIFF, RIFX, AIFF, RF64, W64 | PCM, IEEE float, A-law, µ-law, ADPCM |
| FLAC | .flac | Native FLAC, Ogg/FLAC | PCM (16/24/32-bit) |
| RAW | varies | None (headerless) | PCM, IEEE float, A-law, µ-law |
Warning: All microservices require mono audio input, with one exception: the Time Analysis microservice supports both mono and stereo audio.
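
Stereo recordings need to be reduced to a single channel before they are sent to any microservice other than Time Analysis. The following is a minimal sketch of one way to do that locally, assuming 16-bit PCM in a standard RIFF WAV file and using only the Python standard library. `downmix_to_mono` is a hypothetical helper, not part of the microservices, and averaging the channels is only one possible downmix policy.

```python
import array
import wave


def downmix_to_mono(src, dst):
    """Copy a 16-bit PCM WAV from src to dst, averaging all channels into one.

    src and dst may be file paths or binary file-like objects.
    """
    with wave.open(src, "rb") as reader:
        channels = reader.getnchannels()
        if reader.getsampwidth() != 2:
            raise ValueError("this sketch handles 16-bit PCM only")
        rate = reader.getframerate()
        # Samples are interleaved: L0, R0, L1, R1, ...
        samples = array.array("h")
        samples.frombytes(reader.readframes(reader.getnframes()))

    # Average each frame's channel values into a single sample.
    mono = array.array(
        "h",
        (
            sum(samples[i:i + channels]) // channels
            for i in range(0, len(samples), channels)
        ),
    )

    with wave.open(dst, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)
        writer.setframerate(rate)
        writer.writeframes(mono.tobytes())
```

For example, `downmix_to_mono("stereo.wav", "mono.wav")` would write a mono copy next to the original; both file names are placeholders.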

Detailed Format Specifications

WAV

The WAV format is supported with multiple container variants and a wide range of sample encodings.

Supported Containers

| Container | Description |
|---|---|
| RIFF | Standard WAV format (little-endian) |
| RIFX | Big-endian WAV variant |
| AIFF | Audio Interchange File Format |
| RF64 | Extended WAV format for files larger than 4 GB |
| W64 | Sony Wave64 format for large files |
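
Each of these containers can be recognized from the first bytes of the file. The sketch below shows one way to tell them apart before uploading. The signatures come from the public format specifications, not from this documentation, and `detect_container` is a hypothetical helper; how the microservices themselves identify containers is not described here.

```python
# Wave64 files start with a 16-byte GUID whose first four bytes spell "riff".
W64_RIFF_GUID = bytes.fromhex("726966662e91cf11a5d628db04c10000")


def detect_container(header: bytes) -> str:
    """Guess the container from the first 16 bytes of a file."""
    tag, form = header[:4], header[8:12]
    if tag == b"RIFF" and form == b"WAVE":
        return "RIFF"
    if tag == b"RIFX" and form == b"WAVE":
        return "RIFX"
    if tag == b"RF64" and form == b"WAVE":
        return "RF64"
    if tag == b"FORM" and form in (b"AIFF", b"AIFC"):
        return "AIFF"
    if header[:16] == W64_RIFF_GUID:
        return "W64"
    return "unknown"
```

A caller would typically read 16 bytes from the file in binary mode and pass them in, for example `detect_container(open(path, "rb").read(16))`, where `path` is a placeholder.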

Supported Encodings

| Encoding | Description |
|---|---|
| Unsigned 8-bit PCM | 8-bit unsigned integer samples |
| Signed 12-bit PCM | 12-bit signed integer samples |
| Signed 16-bit PCM | 16-bit signed integer samples (CD quality) |
| Signed 24-bit PCM | 24-bit signed integer samples (studio quality) |
| Signed 32-bit PCM | 32-bit signed integer samples |
| IEEE 32-bit float | 32-bit floating-point samples |
| IEEE 64-bit float | 64-bit floating-point samples |
| A-law | Logarithmic compression (telephony standard) |
| µ-law (u-law) | Logarithmic compression (North American telephony) |
| Microsoft ADPCM | Adaptive differential pulse-code modulation |
| IMA ADPCM | DVI ADPCM, format code 0x11 |
Note: ADPCM encoding is not supported when using the AIFF container.
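
To find out which encoding a given WAV file uses, the format code can be read from its `fmt ` chunk. The sketch below does this for little-endian RIFF files only. Apart from 0x11 for IMA ADPCM, which is listed above, the code-to-name mapping is taken from the standard WAVE format registry rather than from this documentation, and `read_fmt` is a hypothetical helper.

```python
import struct

# Format codes from the standard WAVE registry (assumed, except 0x0011).
FORMAT_TAGS = {
    0x0001: "PCM",
    0x0002: "Microsoft ADPCM",
    0x0003: "IEEE float",
    0x0006: "A-law",
    0x0007: "µ-law",
    0x0011: "IMA ADPCM",
}


def read_fmt(stream):
    """Return encoding, channels, sample rate and bit depth of a RIFF WAV."""
    riff, _size, wave_id = struct.unpack("<4sI4s", stream.read(12))
    if riff != b"RIFF" or wave_id != b"WAVE":
        raise ValueError("not a little-endian RIFF/WAVE file")
    while True:
        head = stream.read(8)
        if len(head) < 8:
            raise ValueError("no fmt chunk found")
        chunk_id, chunk_size = struct.unpack("<4sI", head)
        if chunk_id == b"fmt ":
            body = stream.read(chunk_size)
            tag, channels, rate, _, _, bits = struct.unpack("<HHIIHH", body[:16])
            return {
                "encoding": FORMAT_TAGS.get(tag, f"unknown (0x{tag:04X})"),
                "channels": channels,
                "sample_rate": rate,
                "bits_per_sample": bits,
            }
        # Skip other chunks; chunk data is padded to an even length.
        stream.seek(chunk_size + (chunk_size & 1), 1)
```

The returned `channels` value can also be used to confirm that a file is mono before it is submitted.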

FLAC

FLAC (Free Lossless Audio Codec) compresses audio losslessly, so the decoded samples are identical to the original.

Supported Containers

| Container | Description |
|---|---|
| Native FLAC | Standard FLAC format |
| Ogg/FLAC | FLAC audio in Ogg container |

Supported Encodings

| Encoding | Description |
|---|---|
| Signed 16-bit PCM | 16-bit signed integer samples |
| Signed 24-bit PCM | 24-bit signed integer samples |
| Signed 32-bit PCM | 32-bit signed integer samples |

RAW Audio Stream

For raw audio data without a container, the following encodings are supported:

| Encoding | Description |
|---|---|
| Signed 16-bit PCM | 16-bit signed integer samples |
| Signed 32-bit PCM | 32-bit signed integer samples |
| IEEE 32-bit float | 32-bit floating-point samples |
| A-law | Logarithmic compression |
| µ-law (u-law) | Logarithmic compression |
Info: When using raw audio streams, you must specify the sample rate, sample format, and other audio parameters explicitly, as this information is not embedded in the data.
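
Because a raw stream carries no header, the same bytes can only be interpreted once these parameters are known. The sketch below illustrates this on the client side. `RawAudioSpec`, `decode_raw`, and `duration_seconds` are hypothetical names, little-endian byte order is an assumption (this page does not state the expected byte order), and the way parameters are actually passed to a microservice is not shown here.

```python
import struct
from dataclasses import dataclass

# struct type codes for the linear raw encodings handled by this sketch.
_STRUCT_CODES = {"s16": "h", "s32": "i", "f32": "f"}


@dataclass(frozen=True)
class RawAudioSpec:
    sample_rate: int   # samples per second, per channel
    encoding: str      # "s16", "s32" or "f32"
    channels: int = 1  # most microservices expect mono


def decode_raw(data: bytes, spec: RawAudioSpec) -> list[float]:
    """Decode headerless little-endian samples to floats."""
    code = _STRUCT_CODES[spec.encoding]
    width = struct.calcsize("<" + code)
    if len(data) % width:
        raise ValueError("data length is not a whole number of samples")
    values = struct.unpack(f"<{len(data) // width}{code}", data)
    if code == "f":
        return list(values)
    # Scale integer PCM into the range [-1.0, 1.0).
    scale = float(1 << (8 * width - 1))
    return [v / scale for v in values]


def duration_seconds(data: bytes, spec: RawAudioSpec) -> float:
    """Length of the stream implied by the declared parameters."""
    width = struct.calcsize("<" + _STRUCT_CODES[spec.encoding])
    return len(data) / (width * spec.channels * spec.sample_rate)
```

Declaring the wrong sample rate or sample format does not raise an error in this sketch; it simply yields a different duration and different sample values, which is why the parameters have to be supplied correctly alongside the data.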

Sample Rate

There are no strict limitations on sample rate. However, for optimal speech processing performance, a sample rate of 8 kHz or higher is recommended. Common sample rates include:

  • 8 kHz – Telephony quality
  • 16 kHz – Wideband speech
  • 44.1 kHz – CD quality
  • 48 kHz – Professional audio/video