Time Analysis of Speech
Technology description
Time Analysis of Speech is designed to extract fundamental information about the flow of dialogue in stereo recordings, providing users with conversation characteristics such as:
- Long Reaction Time Identification: Time Analysis can pinpoint the longest delay in individual speakers' reactions in the conversation. This feature is valuable for identifying areas for improvement in conversational efficiency.
- Crosstalk Detection: Time Analysis can detect crosstalks, meaning the moments when speakers in different channels talk simultaneously, potentially complicating communication. By analyzing crosstalks, users can better anticipate communication breakdowns and take proactive measures to mitigate them.
- Speech Rate Measurement: Time Analysis measures the rate of speech in terms of phonemes per second. Phonemes are the smallest units of sound that distinguish one word from another in a particular language. This measurement provides fundamental insight into the pace of the conversation.
Input
The primary application of Time Analysis is in analyzing recordings of two-channel phone calls, where one channel captures the voice of the operator, and the other records the voice of the caller. In this setup, Time Analysis can extract useful information about the conversation dynamics.
Output
The JSON output of Time Analysis contains two main fields: channel_analyses, with information about individual channels, and reaction_analyses, with information about combinations of channels. Let's have a closer look at each field.
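As a rough illustration, the two fields could be pulled out of a result like this. This is only a sketch: it assumes that both fields sit at the top level of the parsed JSON and that each holds a list of entries, which may not match the actual response layout. The helper name split_output and the sample string are hypothetical, and the entries are left empty because their inner field names are not specified here.

```python
import json


def split_output(raw: str):
    """Return the per-channel and per-channel-pair parts of a result.

    Assumption: both fields are top-level keys holding lists.
    """
    data = json.loads(raw)
    channel_analyses = data.get("channel_analyses", [])
    reaction_analyses = data.get("reaction_analyses", [])
    return channel_analyses, reaction_analyses


# Hypothetical sample; the empty objects are placeholders, not real output.
sample = '{"channel_analyses": [{}, {}], "reaction_analyses": [{}, {}]}'
channels, reactions = split_output(sample)
print(len(channels), "channel entries,", len(reactions), "reaction entries")
```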
Channel analysis
In the channel_analyses section of the result, you can find detailed statistics for each channel showing several key characteristics of the speech activity of the speakers:
- Speech Duration: Net speech duration refers to the portion of the audio that contains speech, excluding any pauses, hesitations, or non-verbal sounds.
- Speech Rate: The average speech rate (measured in phonemes per second) provides a basic insight into the pace of the conversation.
- Total Duration: The overall length of the recording gives a basic idea of the temporal scope of the content.
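To make the three statistics concrete, here is a minimal sketch of how they relate to each other. It is not the implementation used by Time Analysis. It assumes that speech segments are available as (start, end) pairs in seconds, that a phoneme count is known, and that the speech rate is computed over net speech time rather than over the whole recording. The function name channel_stats and the dictionary keys are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical input format: (start, end) of each speech segment in seconds.
Segment = Tuple[float, float]


def channel_stats(
    segments: List[Segment], phoneme_count: int, total_duration: float
) -> Dict[str, float]:
    """Derive per-channel statistics from speech segments."""
    # Net speech duration: only the time covered by speech segments.
    speech_duration = sum(end - start for start, end in segments)
    # Assumption: phonemes per second of net speech, not of the recording.
    speech_rate = phoneme_count / speech_duration if speech_duration > 0 else 0.0
    return {
        "speech_duration": speech_duration,
        "speech_rate": speech_rate,
        "total_duration": total_duration,
    }


# A 10-second recording with 5 seconds of speech and 60 phonemes.
stats = channel_stats([(0.0, 2.0), (5.0, 8.0)], phoneme_count=60, total_duration=10.0)
print(stats)
```

In this example the channel contains 5 seconds of net speech in a 10-second recording, so 60 phonemes give a rate of 12 phonemes per second.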
Reaction analysis
In the reaction_analyses section of the result, you can find valuable insights into the interactions between speakers in terms of turn-taking and crosstalks within the conversation:
- Reactions Count: The number of reactions from this channel to the other channel. A "reaction" occurs when the speaker in this channel starts speaking immediately after the speaker in the other channel has stopped.
- Average Reaction Time: The average duration between when the speaker in the other channel stops speaking and when the speaker in the reacting channel begins speaking.
- Slowest Reaction Position: Position of this channel's slowest reaction (longest reaction time).
- Fastest Reaction Position: Position of this channel's fastest reaction (shortest reaction time).
- Crosstalks: List of positions of this channel's crosstalks.
More information
For more information on how to use Time Analysis, you can read this detailed guide