Integrated Bioinformatics and Ensemble Learning Reveal Diagnostic Modeling and Drug Discovery in Alzheimer’s Disease

Abstract

Background: Alzheimer’s disease (AD) is driven by complex molecular and immune dysregulation, yet reliable diagnostic biomarkers and druggable targets remain limited. This study aimed to identify key AD-associated regulatory genes, characterize their immune and spatial expression features, and prioritize small-molecule compounds with therapeutic potential.

Methods: Multiple AD-related transcriptomic datasets, including bulk RNA-seq, microarray, and spatial transcriptomic profiles, were retrieved from GEO and systematically partitioned into discovery (GSE5281, GSE66333), validation (GSE110226, GSE28146, GSE29378), independent testing (GSE29378), and spatial validation (GSE147047) cohorts. Differential expression analysis and weighted gene co-expression network analysis (WGCNA) were used to construct co-expression networks and define AD-associated gene modules. Protein–protein interaction (PPI) analysis and multiple network centrality measures were then applied to prioritize candidate key genes. Twelve machine-learning algorithms were combined into 127 classification models, and SHAP-based interpretability analysis was used to quantify feature contributions and identify diagnostic genes. Single-cell and spatial transcriptomic data were further used to validate the cell type specificity and spatial localization of the hub genes. Drug–gene enrichment analysis (DSigDB), compound retrieval (PubChem), ADMET and drug-likeness profiling, and blind molecular docking were integrated to screen and evaluate potential lead compounds.

Results: We identified 2,534 differentially expressed genes (DEGs) between AD and control samples, and their intersection with WGCNA-derived modules yielded 848 candidate genes. PPI-based network analysis prioritized 15 key genes, on which 127 machine-learning models were constructed; the random forest model achieved the best overall performance with an average AUC of 0.957. SHAP analysis identified 11 key diagnostic genes, among which IGF1R and SPP1 emerged as stable hub genes with AUCs greater than 0.70 across multiple external cohorts. Immune infiltration, single-cell, and spatial transcriptomic analyses demonstrated distinct immune associations and cell type– and region-specific expression patterns of these hub genes. Drug–gene enrichment identified 176 drug signatures and 445 related compounds, of which 37 grade-A molecules remained after ADMET and drug-likeness filtering. Molecular docking revealed four top-ranked compounds with binding energies below −9.0 kcal/mol, including one ligand with a minimum binding energy of −10.5 kcal/mol and extensive non-covalent interactions with the target protein.

Conclusion: This study developed a systematic methodological framework that runs from gene discovery and diagnostic modeling to lead drug screening. IGF1R and SPP1 were identified as stable and biologically interpretable AD hub genes that may serve as potential diagnostic markers, and several high-affinity small-molecule compounds targeting these hub genes provide new drug candidates for targeted AD therapy.
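The abstract judges each hub gene by whether its AUC exceeds 0.70 in external cohorts. As a rough illustration of what that criterion measures, and not the authors' code, the sketch below computes the AUC of a single gene's expression as the probability that a randomly chosen AD sample scores higher than a randomly chosen control (the Mann-Whitney formulation, with ties counted as half). The function name `auc_from_scores` and the expression values are hypothetical and are not data from the study.

```python
from typing import Sequence


def auc_from_scores(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random case outscores a random control.

    Labels are 1 for cases (AD) and 0 for controls; ties count as half a win.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical expression values for one gene (illustrative only).
expression = [2.1, 3.4, 3.9, 4.2, 1.8, 2.5, 3.0, 2.2]
is_ad = [1, 1, 1, 1, 0, 0, 0, 0]

auc = auc_from_scores(expression, is_ad)
passes_threshold = auc > 0.70  # the stability cut-off quoted in the abstract
print(f"AUC = {auc:.4f}, passes 0.70 threshold: {passes_threshold}")
```

Under this definition an AUC of 0.5 corresponds to chance-level separation, so a 0.70 cut-off applied across several independent cohorts is a modest but repeatable bar. A gene that is lower in AD would score below 0.5 here, and would need its sign flipped before applying the same threshold.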
Springer Science and Business Media LLC

Related Results

Selection of Injectable Drug Product Composition using Machine Learning Models (Preprint)
BACKGROUND As of July 2020, a Web of Science search of “machine learning (ML)” nested within the search of “pharmacokinetics or pharmacodynamics” yielded over 100...
Penerapan Metode Convolutional Neural Network untuk Diagnosa Penyakit Alzheimer (Application of the Convolutional Neural Network Method for Alzheimer's Disease Diagnosis)
Abstract: Alzheimer's disease is a neurodegenerative disease that develops gradually, and is associated with cardiovascular and cerebrovascular problems. Alzheimer's is a serious d...
Suffering of Patients with Neurogenic Thoracic Outlet Syndrome (TOS); The First Qualitative study in TOS
Abstract Background: Diagnosis of neurogenic thoracic outlet syndrome (nTOS) is hindered by symptom overlap with cervical radiculopathy, carpal tunnel syndrome, or psychosomatic dis...
Advancements in Biomedical and Bioinformatics Engineering
Abstract: The field of biomedical and bioinformatics engineering is witnessing rapid advancements that are revolutionizing healthcare and medical research. This chapter provides a...
ATN status in amnestic and non-amnestic Alzheimer’s disease and frontotemporal lobar degeneration
Abstract: Under the ATN framework, cerebrospinal fluid analytes provide evidence of the presence or absence of Alzheimer’s disease pathological hallmarks: amyloid plaques (A), phosph...
Bioinformatics tool and web server development focusing on structural bioinformatics applications
This thesis is divided into two main sections: Part 1 describes the design, and evaluation of the accuracy of a new web server – PRotein Interactive MOdeling (PRIMO-Complexes) for ...
A large-scale analysis of bioinformatics code on GitHub
Abstract: In recent years, the explosion of genomic data and bioinformatic tools has been accompanied by a growing conversation around reproducibility of results and usability of sof...
