
Comparative analysis of resampling algorithms in the prediction of stroke diseases

Stroke is a major cause of death globally. Early prediction of the disease can save many lives, but most clinical datasets, including the stroke dataset, are imbalanced, which biases predictive algorithms towards the majority class. The objective of this research is to compare different data resampling algorithms on the stroke dataset in order to improve the predictive performance of machine learning models. This paper considers five (5) resampling algorithms, namely Random Over-Sampling (ROS), Synthetic Minority Over-sampling Technique (SMOTE), Adaptive Synthetic sampling (ADASYN), and the hybrid techniques SMOTE with Edited Nearest Neighbor (SMOTE-ENN) and SMOTE with Tomek Links (SMOTE-TOMEK). The resampled data were used to train six (6) machine learning classifiers, namely Logistic Regression (LR), Decision Tree (DT), K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Random Forest (RF), and XGBoost (XGB). The hybrid technique SMOTE-ENN improves the machine learning classifiers the most, followed by SMOTE, while the combination of SMOTE and XGB performs best, with an accuracy of 97.99%, a G-mean score of 0.99, and an AUC-ROC score of 0.99. Resampling algorithms balance the dataset and enhance the predictive power of machine learning algorithms. Therefore, we recommend resampling the stroke dataset before predicting stroke disease rather than modeling on the imbalanced dataset.
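As a rough illustration of two ideas in the abstract, the sketch below shows random over-sampling (the simplest of the five resampling algorithms) and the G-mean metric, computed as the geometric mean of sensitivity and specificity. It is not the authors' code. The function names `random_oversample` and `g_mean` and the toy 95/5 dataset are assumptions made up for this example, and only the Python standard library is used.

```python
import math
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Hypothetical toy data: 95 non-stroke (0) and 5 stroke (1) records.
labels = [0] * 95 + [1] * 5
rows = [[i] for i in range(100)]

# A "model" that always predicts the majority class looks accurate
# but never detects a stroke case, so its G-mean is zero.
always_negative = [0] * len(labels)
accuracy = sum(t == p for t, p in zip(labels, always_negative)) / len(labels)
print("accuracy:", accuracy)
print("g-mean:", g_mean(labels, always_negative))

# After random over-sampling, both classes have the same number of rows.
balanced_rows, balanced_labels = random_oversample(rows, labels)
print("class counts:", Counter(balanced_labels))
```

On the toy data, the always-negative predictor reaches 95% accuracy with a G-mean of 0, which is the majority-class bias the abstract describes. In practice, resampling is normally applied to the training split only, so that evaluation data keep the original class distribution.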

Related Results

Iranian stroke model-how to involve health policymakers
Stroke in Iran, with more than 83 million population, is a leading cause of disability and mortality in adults. Stroke has higher incidence in Iran comparing the global situation a...
Primerjalna književnost na prelomu tisočletja
In a comprehensive and at times critical manner, this volume seeks to shed light on the development of events in Western (i.e., European and North American) comparative literature ...
HIPERTENSI, USIA, JENIS KELAMIN DAN KEJADIAN STROKE DI RUANG RAWAT INAP STROKE RSUD dr. M. YUNUS BENGKULU
Hypertension, Age, Sex, and Stroke Incidence in the Stroke Inpatient Ward of RSUD dr. M. Yunus Bengkulu. Abstract: Stroke is a set of symptoms of functional deficits of the nervous system caused...
Heterogeneity among women with stroke: health, demographic and healthcare utilization differentials
Background: Although age-specific stroke rates are higher in men, women have a higher lifetime risk and are more likely to die from a stroke...
Comparative Characterization of Candidate Molecular Markers in Ischemic and Hemorrhagic Stroke
According to epidemiological studies, the leading cause of morbidity, disability and mortality are cerebrovascular diseases, in particular ischemic and hemorrhagic strokes. In rece...
The State of Stroke in Somalia: Scoping Review
Background: Stroke is a leading cause of death and disability globally, with limited data available on its burden in Somalia. Stroke presents a significant public health concern in...
