Challenges of Clustering Multimodal Clinical Data: Review of Applications in Asthma Subtyping (Preprint)
BACKGROUND
In the current era of personalized medicine, there is increasing interest in understanding the heterogeneity in disease populations. Cluster analysis is a method commonly used to identify subtypes in heterogeneous disease populations. The clinical data used in such applications are typically multimodal, which can make the application of traditional cluster analysis methods challenging.
OBJECTIVE
This study aimed to review the research literature on the application of cluster analysis to multimodal clinical data to identify asthma subtypes. We assessed common problems and shortcomings in how cluster analysis methods are applied to determine asthma subtypes, so that they can be brought to the attention of the research community and avoided in future studies.
METHODS
We searched PubMed and Scopus bibliographic databases with terms related to cluster analysis and asthma to identify studies that applied dissimilarity-based cluster analysis methods. We recorded the analytic methods used in each study at each step of the cluster analysis process.
RESULTS
Our literature search identified 63 studies that applied cluster analysis to multimodal clinical data to identify asthma subtypes. The features fed into the clustering algorithms were of mixed type in 47 (75%) studies and continuous in 12 (19%) studies; the feature type was unclear in the remaining 4 (6%) studies. A total of 23 (37%) studies used hierarchical clustering with Ward linkage, and 22 (35%) studies used k-means clustering. Of these 45 studies, 39 had mixed-type features, but only 5 specified dissimilarity measures that could handle mixed-type features. A further 9 (14%) studies used a preclustering step to create small clusters that were then fed into a hierarchical method. The original sample sizes in these 9 studies ranged from 84 to 349. The remaining studies used hierarchical clustering with other linkages (n=3), medoid-based methods (n=3), spectral clustering (n=1), and multiple kernel k-means clustering (n=1); in 1 study, the methods were unclear. Of the 63 studies, 54 (86%) explained the methods used to determine the number of clusters, 24 (38%) tested the quality of their cluster solution, and 11 (17%) tested the stability of their solution. Reporting of the cluster analysis was generally poor in terms of the methods employed and their justification.
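To illustrate what a dissimilarity measure that can handle mixed-type features might look like, the following is a minimal sketch of a Gower-style dissimilarity in plain Python. It is not taken from the review or from any of the reviewed studies. The function name `gower_matrix`, the patient records, and the feature choices are all hypothetical and serve only as an example. Numeric features contribute a range-scaled absolute difference, categorical features contribute a simple mismatch (0 or 1), and the per-feature contributions are averaged with equal weight.

```python
from typing import List, Sequence, Union

Value = Union[float, str]


def gower_matrix(rows: Sequence[Sequence[Value]],
                 numeric: Sequence[bool]) -> List[List[float]]:
    """Pairwise Gower-style dissimilarities for mixed-type records.

    rows    -- one record per subject, all with the same feature order
    numeric -- numeric[k] is True if feature k is continuous,
               False if it is categorical
    Assumes no missing values and equal feature weights.
    """
    n_features = len(numeric)

    # Range of each numeric feature, used to scale differences to [0, 1].
    ranges: List[float] = []
    for k in range(n_features):
        if numeric[k]:
            column = [float(r[k]) for r in rows]
            ranges.append(max(column) - min(column))
        else:
            ranges.append(0.0)  # unused for categorical features

    n = len(rows)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            total = 0.0
            for k in range(n_features):
                if numeric[k]:
                    if ranges[k] > 0:
                        diff = abs(float(rows[i][k]) - float(rows[j][k]))
                        total += diff / ranges[k]
                    # a constant feature contributes 0
                else:
                    total += 0.0 if rows[i][k] == rows[j][k] else 1.0
            dist[i][j] = dist[j][i] = total / n_features
    return dist


# Hypothetical records: (FEV1 % predicted, age at onset, atopy, smoking)
patients = [
    (90.0, 5.0, "atopic", "never"),
    (60.0, 40.0, "non-atopic", "former"),
    (75.0, 5.0, "atopic", "never"),
]
is_numeric = [True, True, False, False]

d = gower_matrix(patients, is_numeric)
for row in d:
    print([round(x, 3) for x in row])
```

A matrix like `d` could then be passed to a dissimilarity-based method such as a medoid-based algorithm or hierarchical clustering with a linkage that accepts arbitrary dissimilarities. Ward linkage and k-means are usually formulated in terms of Euclidean distances, so pairing them with mixed-type features calls for explicit justification.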
CONCLUSIONS
This review highlights common issues in the application of cluster analysis to multimodal clinical data to identify asthma subtypes. Some of these issues were related to the multimodal nature of the data, but many were more general issues in the application of cluster analysis. Although cluster analysis may be a useful tool for investigating disease subtypes, we recommend that future studies carefully consider the implications of clustering multimodal data, the cluster analysis process itself, and the reporting of methods to facilitate replication and interpretation of findings.

