Comparative Analysis of Deep Learning Models for Dysarthric Speech Detection
Abstract
Dysarthria is a speech communication disorder associated with neurological impairments. To detect this disorder from speech, we present an experimental comparison of deep models built on frequency-domain features, using scalograms of dysarthric speech as input. Such a detection system can assist physicians and other specialists by providing its detection results. Because dysarthric speech signals contain breathy and semi-whispery segments, experiments are performed only on the frequency-domain representation of the speech signals. Each time-domain speech signal is transformed into a 2-D scalogram image through a wavelet transform. The scalogram images are then fed to pre-trained convolutional neural networks (CNNs), whose layers are tuned for the scalogram images through transfer learning. The proposed method is evaluated on the TORGO database, and the classification performance of the networks is compared. Three pre-trained CNNs are considered in this work: AlexNet, GoogLeNet, and ResNet-50. The proposed method, which combines pre-trained, transfer-learned CNNs with scalogram image features, achieved better accuracy than other machine learning models for dysarthria detection.
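As an illustration of the scalogram step described in the abstract, the sketch below computes the squared magnitude of a continuous wavelet transform and returns it as a 2-D array (scale by time). It is not the authors' implementation: the abstract does not name the wavelet, the scales, or the image size, so the complex Morlet wavelet, the parameter `w0`, the scale list, and the function names `morlet` and `scalogram` are all assumptions chosen for the example. It uses only the Python standard library to stay self-contained; a dedicated wavelet library would normally be used for real speech data.

```python
import cmath
import math


def morlet(t, w0=6.0):
    """Complex Morlet mother wavelet (assumed choice; correction term omitted)."""
    return math.pi ** -0.25 * cmath.exp(1j * w0 * t) * math.exp(-0.5 * t * t)


def scalogram(signal, scales, w0=6.0):
    """Return |CWT|^2 as a list of rows, one row per scale.

    Every row has len(signal) entries, so the result can be treated as a
    2-D image with scale on one axis and time on the other.
    """
    n = len(signal)
    rows = []
    for s in scales:
        # Sample the scaled wavelet on a finite support of about +/- 4 scales.
        half = min(int(4 * s), n - 1)
        kernel = [morlet(k / s, w0).conjugate() / math.sqrt(s)
                  for k in range(-half, half + 1)]
        row = []
        for b in range(n):
            acc = 0j
            for j, kv in enumerate(kernel):
                i = b + j - half
                if 0 <= i < n:  # samples outside the signal count as zero
                    acc += signal[i] * kv
            row.append(abs(acc) ** 2)
        rows.append(row)
    return rows


# Toy input: a pure tone at 0.05 cycles per sample (hypothetical test signal,
# not speech data).
tone = [math.sin(2 * math.pi * 0.05 * i) for i in range(256)]
scales = [5, 10, 20, 40]
image = scalogram(tone, scales)

# Find which scale carries the most energy at the middle of the signal.
mid = len(tone) // 2
strongest = max(range(len(scales)), key=lambda r: image[r][mid])
print("scale with most energy:", scales[strongest])
```

In a pipeline like the one the abstract outlines, an array of this kind would be rendered as an image and resized to the input dimensions expected by the pre-trained CNN before transfer learning; those steps are omitted here because the abstract gives no details on them.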

