Vocal-Biomarkers: Technology for Analyzing Emotions

In individuals with major depressive disorder, neurophysiological changes (bodily events such as the firing of nerve cells in the brain or the release of hormones, which underlie actions, feelings, and thoughts) often alter motor control. This in turn affects the mechanisms that control speech production and facial expression. These changes are related to psychomotor retardation, a condition marked by slowed neuromotor output that manifests behaviorally as altered coordination and timing across multiple motor-based properties.

Vocal-biomarker analysis is a technique that enables machines to understand human emotions by recording, detecting, and analyzing raw voice intonations as people speak. Biomarkers in a person's voice have been linked to disorders such as depression, coronary artery disease, and anxiety.

By decoding human vocal intonations into their underlying emotions in real time, this technology enables applications, solutions, and voice-powered devices to interact at an emotional level, just as humans do. The platform is designed to generate persistent, passive health awareness that enables and enhances holistic patient management. It measures the features of speech but does not require the content of the speech, thereby protecting privacy.

Vocal biomarkers are tools that use voice input to give clear insight into a person's health conditions. Unlike blood tests or MRI/CT scans, they allow non-invasive, effective, and fast diagnosis of serious conditions. Various aspects of a person's voice can help analyze that person's mental and physical well-being. This tool has tremendous potential to transform diagnostic procedures in terms of speed, accuracy, and cost-effectiveness. Moreover, voice-based diagnosis does not expose patients to radiation and is therefore safer in that respect.

Self-diagnosis is an app away

The emergence of vocal-test applications on smartphones and other voice-enabled devices offers a low-cost screening option that helps identify people with heart disease and other disorders. These user-friendly applications will increase the possibility of self-diagnosis and help patients and their families learn of potential health risks in advance.

By capturing the emotional dimensions of human communication, i.e., people's moods, attitudes, and emotional characteristics (also known as personalities), this technology allows devices and applications to understand not just what we type, click, say, or touch, but what we mean and how we feel.

Speech as a Biomarker

A neurological disorder leaves a fingerprint in voice and speech production. Speech signal analysis can deliver clinical information that can be used to foresee certain diseases, provide information about the neurological position of specific diseases, and assess disease progression or the effectiveness of treatment regimens. The most common way to assess the motor speech characteristics associated with neurological disease is the auditory perceptual method. In this method, a professional listens to and analyzes the motor speech changes associated with neurological disease processes in order to make clinical assessments, judgments, and decisions regarding functional change. There is a need for a fully automated assessment system that can collect, organize, process, and analyze speech from individuals with concussions and deliver meaningful insights about the effects on speech production.

Once an audio recording has been processed, the next step is to extract the acoustic features that will form the basis of the health diagnostics tool. Speech can be characterized by a variety of features, and these can be computed for linguistic elements such as phonemes, vowels, words, or even entire sentences.
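As a rough illustration of what feature extraction can look like, the sketch below splits a signal into short overlapping frames and computes two simple acoustic features per frame. The choice of features (RMS energy and zero-crossing rate), the frame sizes, and all function names are assumptions made for this example; the article does not name the specific features a real tool would use.

```python
import math

def frame_signal(samples, frame_len, hop_len):
    """Split samples into overlapping frames; an incomplete tail is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop_len)]

def rms_energy(frame):
    """Root-mean-square amplitude of one frame (a loudness proxy)."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def extract_features(samples, frame_len=400, hop_len=160):
    """Return one (rms, zcr) pair per frame.

    The defaults correspond to 25 ms frames with a 10 ms hop at 16 kHz,
    an assumed configuration for this sketch.
    """
    return [(rms_energy(f), zero_crossing_rate(f))
            for f in frame_signal(samples, frame_len, hop_len)]

# Synthetic stand-in for a recording: 0.1 s of a 200 Hz tone sampled at 16 kHz.
sample_rate = 16000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(1600)]
features = extract_features(tone)
```

In practice the same per-frame values could be aggregated over a phoneme, a word, or a whole sentence, depending on which linguistic element is being studied.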

Time measurements can also be taken, such as the average and standard deviation of the duration of a linguistic entity or the duration of pauses. The diadochokinetic (DDK) rate measures how quickly an individual can accurately produce a series of rapid, alternating sounds. It is a tool for assessing oral motor skills.
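The time measurements described above can be sketched as follows, assuming the start and end time of each spoken unit is already known. The `(start, end)` input format, the function name, and the example values are hypothetical.

```python
from statistics import mean, stdev

def timing_features(segments):
    """Compute duration and pause statistics.

    segments: list of (start_s, end_s) tuples, one per spoken unit,
    in chronological order (an assumed input format).
    """
    durations = [end - start for start, end in segments]
    # A pause is the gap between the end of one unit and the start of the next.
    pauses = [next_start - end
              for (_, end), (next_start, _) in zip(segments, segments[1:])]
    return {
        "mean_duration": mean(durations),
        "std_duration": stdev(durations) if len(durations) > 1 else 0.0,
        "mean_pause": mean(pauses) if pauses else 0.0,
        "total_pause": sum(pauses),
    }

# Hypothetical word boundaries in seconds.
words = [(0.00, 0.40), (0.60, 0.90), (1.20, 1.70)]
stats = timing_features(words)
```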

If an individual is experiencing speech or language problems, a speech-language pathologist may measure the DDK rate by asking the individual to repeat certain sounds during a timed test. The results can help the pathologist evaluate the severity of the speech or language problems, diagnose the underlying cause, and prescribe appropriate treatment.
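One simple way to express the result of such a timed test is repetitions per second. The sketch below assumes the syllable onset times have already been detected; the function name and the input format are hypothetical, and clinical protocols may define the rate differently.

```python
def ddk_rate(syllable_onsets_s, trial_start_s, trial_end_s):
    """Return syllable repetitions per second within a timed trial.

    syllable_onsets_s: onset time of each repeated syllable, in seconds
    (assumed to come from an earlier detection step).
    """
    duration = trial_end_s - trial_start_s
    if duration <= 0:
        raise ValueError("trial_end_s must be later than trial_start_s")
    # Count only the onsets that fall inside the timed window.
    count = sum(1 for t in syllable_onsets_s if trial_start_s <= t < trial_end_s)
    return count / duration

# Hypothetical trial: 25 syllable onsets, one every 0.2 s, within 5 seconds.
onsets = [0.1 + 0.2 * i for i in range(25)]
rate = ddk_rate(onsets, 0.0, 5.0)
```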

Efficiency and Accuracy

Examining the motor speech system can provide information about the integrity of the neurological system. The more complex the words, the higher the likelihood of errors and variations in how they are spoken. The words and phrases used in a speaking task are therefore chosen because they contain two or more different syllables and so require the involvement of different parts of the mouth, the tongue, and the soft palate at the back of the throat.

The accuracy of speech analysis (recognition, word onset detection, vocal feature extraction, etc.) degrades severely when speech is recorded in adverse acoustic environments. When building a speech-based diagnostic tool, the basic concerns are an inability to extract the desired vocal features, significant errors in the extracted features, and inaccuracies in detecting the boundaries of words and phonemes.
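One possible safeguard, not described in the article, is to estimate the signal-to-noise ratio (SNR) of a recording and flag it before any features are extracted. The sketch below assumes that a noise-only stretch of the recording is available, and the 15 dB threshold is a hypothetical value chosen purely for illustration.

```python
import math

MIN_SNR_DB = 15.0  # hypothetical threshold, not taken from the article

def power(samples):
    """Mean squared amplitude of a list of samples."""
    return sum(x * x for x in samples) / len(samples)

def estimate_snr_db(speech_samples, noise_samples):
    """Estimate SNR in dB.

    speech_samples is treated as signal plus noise, noise_samples as noise only.
    """
    noise_power = max(power(noise_samples), 1e-12)  # guard against divide-by-zero
    signal_power = max(power(speech_samples) - noise_power, 1e-12)
    return 10 * math.log10(signal_power / noise_power)

def is_usable(speech_samples, noise_samples, min_snr_db=MIN_SNR_DB):
    """Return True when the estimated SNR meets the chosen threshold."""
    return estimate_snr_db(speech_samples, noise_samples) >= min_snr_db

# Synthetic example: a loud alternating signal against a quiet background.
speech = [1.0, -1.0] * 100
noise = [0.1, -0.1] * 100
snr = estimate_snr_db(speech, noise)
```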

As with many other types of data, speech collection, processing, and analysis pose unique challenges that require careful design choices and evaluation to ensure that data collection yields recordings of the highest possible quality. While speech has received an increasing amount of attention, further analyses remain to be done to validate its potential as a biomarker and assessment tool. The focus of prior work has been on demonstrating relationships between neurodevelopmental and neurodegenerative conditions and various vocal features.