
Multimodal Approach in Prediction of Alzheimer’s Disease Using Voice, Transcript Dataset

Introduction: Alzheimer’s disease (AD) is a progressive neurodegenerative disorder characterized by cognitive decline, memory impairment, and deteriorating language abilities. Early and accurate prediction of AD is critical for effective intervention and management. This study proposes a multimodal approach that integrates heterogeneous data sources, including voice recordings, transcribed speech, textual metadata, and neuroimaging, to enhance prediction accuracy.

Objectives: The primary objective of this study is to develop and evaluate a multimodal machine learning framework that combines acoustic features from voice recordings with linguistic features from speech transcripts to improve the accuracy and reliability of early Alzheimer’s disease prediction. By using both speech and textual data, the approach aims to capture subtle cognitive and behavioral patterns that may not be evident in a single modality, ultimately contributing to earlier, non-invasive, and scalable diagnostic tools.

Methods: The proposed approach integrates voice recordings, transcribed speech, textual metadata, and neuroimaging. By leveraging the complementary strengths of each modality, the system captures both linguistic and paralinguistic features from speech, semantic and syntactic cues from transcripts, and structural biomarkers from MRI scans.

Results: Experimental results on benchmark datasets demonstrate that the multimodal fusion approach significantly outperforms unimodal baselines, offering a more robust and holistic understanding of AD-related indicators. Train and test accuracy are nearly equal, and accuracy above 70% is achieved on both datasets. This approach underscores the potential of multimodal machine learning in advancing non-invasive, early-stage Alzheimer’s diagnosis.

Conclusions: This study demonstrates the effectiveness of a multimodal approach that integrates both voice and transcript data for the prediction of Alzheimer’s disease. By leveraging linguistic features from transcripts alongside acoustic features from voice recordings, the model achieves a more comprehensive understanding of cognitive decline indicators. The fusion of these modalities not only improves prediction accuracy but also provides a non-invasive, cost-effective alternative for early detection. Future work may focus on expanding datasets, incorporating other modalities (e.g., facial expressions or brain imaging), and enhancing real-time clinical applicability to support early intervention and better patient outcomes.
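The abstract does not specify how the acoustic and linguistic features are fused or which classifier is used. As a rough illustration only, the sketch below assumes feature-level (early) fusion: each modality is standardized and the per-sample feature vectors are concatenated before a simple classifier is trained. The data are synthetic placeholders, and the function names (`zscore`, `fuse`, `fit_logistic`, `accuracy`), the feature dimensions, and the logistic-regression classifier are all assumptions made for the example, not the authors' method.

```python
import numpy as np

def zscore(train, other):
    """Standardize columns using statistics from the training split only."""
    mean = train.mean(axis=0)
    std = train.std(axis=0) + 1e-8  # guard against zero variance
    return (train - mean) / std, (other - mean) / std

def fuse(acoustic, linguistic):
    """Feature-level fusion: concatenate per-sample feature vectors."""
    return np.concatenate([acoustic, linguistic], axis=1)

def fit_logistic(X, y, lr=0.1, steps=500):
    """Gradient-descent logistic regression (a stand-in classifier)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
        grad = p - y                             # gradient of the log-loss
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

def accuracy(X, y, w, b):
    """Fraction of samples whose thresholded prediction matches the label."""
    return float((((X @ w + b) > 0).astype(int) == y).mean())

# Synthetic placeholder data (NOT from the study): 200 samples,
# 5 hypothetical acoustic features and 3 hypothetical linguistic features.
rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, size=n)
acoustic = rng.normal(size=(n, 5)) + 0.8 * y[:, None]
linguistic = rng.normal(size=(n, 3)) + 0.8 * y[:, None]

# Hold out the last 50 samples as a test split.
split = 150
a_tr, a_te = zscore(acoustic[:split], acoustic[split:])
l_tr, l_te = zscore(linguistic[:split], linguistic[split:])
X_tr, X_te = fuse(a_tr, l_tr), fuse(a_te, l_te)

w, b = fit_logistic(X_tr, y[:split])
train_acc = accuracy(X_tr, y[:split], w, b)
test_acc = accuracy(X_te, y[split:], w, b)
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

Reporting train and test accuracy side by side, as the Results do, is a quick check for overfitting: a large gap between the two would suggest the fused model is memorizing the training split.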

Science Research Society

Manisha Chaudhari

Journal of Information Systems Engineering and Management

2025



