
A Survey of Automatic Speech Recognition for Dysarthric Speech

Dysarthric speech differs from healthy speech in several pathological characteristics, such as discontinuous pronunciation, uncontrolled volume, slow speech, explosive pronunciation, improper pauses, excessive nasal sounds, and air-flow noise during pronunciation. Automatic speech recognition (ASR) can be very helpful for speakers with dysarthria. This scoping review covers ASR for dysarthric speech, drawing on papers in the field from 1990 to 2022. We found that research on the acoustic features and on the acoustic models of dysarthric speech has developed nearly in parallel. During the 2010s, deep learning technologies were widely applied to improve the performance of ASR systems. In the era of deep learning, many advanced methods (such as convolutional neural networks, deep neural networks, and recurrent neural networks) are being applied to design acoustic models as well as lexical and language models for dysarthric speech recognition tasks. Deep learning methods are also used to extract acoustic features from dysarthric speech. Additionally, this review found that speaker-dependent problems seriously limit the generalizability of acoustic models. The available dysarthric speech data are scarce and fall short of the volume needed to train models that rely on large-scale data.

Related Results

Enhancing dysarthric speech recognition through SepFormer and hierarchical attention network models with multistage transfer learning
Dysarthria, a motor speech disorder that impacts articulation and speech clarity, presents significant challenges for Automatic Speech Recognition (ASR) systems. This study...
Recent Advances in Dysarthric Speech Recognition: Approaches and Datasets
Dysarthria is a neuromotor speech disorder that results from physical disability and limits speech intelligibility. Dysarthric speakers can make use of speech recognition systems t...
A Comprehensive Survey of Automatic Dysarthric Speech Recognition
Automatic dysarthric speech recognition (DSR) is very crucial for many human computer interaction systems that enables the human to interact with machine in natural way. The object...
AFM signal model for dysarthric speech classification using speech biomarkers
Neurological disorders include various conditions affecting the brain, spinal cord, and nervous system which results in reduced performance in different organs and muscles througho...
Automatic speech recognition in voice-speech rehabilitation effectiveness evaluation in patients after laryngectomy
Introduction. Lost voice function compensation determines the personal and social life of laryngectomees. Automatic speech recognition and synthesis methods are...
Multimodal Emotion Recognition and Human Computer Interaction for AI-Driven Mental Health Support (Preprint)
Background: Mental health has become one of the most urgent global health issues of the twenty-first century. The World Health Organization (WHO) reports tha...
Comparative Analysis of Deep Learning Models for Dysarthric Speech Detection
Dysarthria is a speech communication disorder that is associated with neurological impairments. In order to detect this disorder from speech, we present an experim...
Identifying Links Between Latent Memory and Speech Recognition Factors
Objectives: The link between memory ability and speech recognition accuracy is often examined by correlating summary measures of performance across various tasks, but i...
