Machine Learning Modelling for Imbalanced Dataset: Case Study of Adolescent Obesity in Malaysia
Obesity among adolescents is a public health issue with an increasing burden of disease. Predicting from imbalanced health data with machine learning (ML) may introduce bias and diminish model performance. Misclassification in healthcare data could lead to misdiagnosing a patient or failing to detect a health issue that is present. The purpose of this study is to predict adolescent obesity using machine learning while applying multiple approaches to handling the imbalanced dataset. This study used a secondary dataset from the National Health and Morbidity Survey 2017. Samples aged 13 to 17 years were selected for classification. SPSS V26 was used for data pre-processing, data cleaning, and data analysis, while Python was used for prediction and evaluation of the models. Approaches to the imbalanced dataset, including resampling methods (Random Oversampling, Random Under-sampling) and hybrid methods (SMOTE and ADASYN), were implemented. The resulting dataset was used to build predictive models with ML algorithms including Artificial Neural Network, Decision Tree, K-Nearest Neighbour, Logistic Regression, Naïve Bayes, Random Forest and Support Vector Machine. The performance of each model was evaluated and compared using accuracy, precision, recall, F-score and Area Under the Curve (AUC). The Random Oversampling approach with the Decision Tree algorithm performed best, with accuracy (91.35%), precision (0.93), recall (0.91), F-score (0.91) and AUC (0.91) for the prediction of obesity among adolescents in Malaysia. The presented ML model development workflow, along with the imbalance-handling techniques, can be adapted to other health survey-based studies and may be valuable for developing other clinical prediction models.
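As an illustration of two ideas named in the abstract, the sketch below shows random oversampling of a minority class and the computation of accuracy, precision, recall and F-score from binary predictions. It is a minimal standard-library sketch, not the authors' code: the function names `random_oversample` and `binary_metrics`, the toy rows, and the label coding (1 = obese, 0 = not obese) are all assumptions made for the example. The abstract does not describe how the data were split, so the usual practice of resampling only the training portion is noted in a comment as general guidance rather than as the study's procedure.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every
    class has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Draw with replacement from the existing rows of this class.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Toy, made-up training data: 8 majority rows and 2 minority rows.
    # In general, resample only the training split so that duplicated
    # rows cannot leak into the data used for evaluation.
    train_rows = [[i] for i in range(10)]
    train_labels = [0] * 8 + [1] * 2
    _, balanced_labels = random_oversample(train_rows, train_labels, seed=1)
    print(Counter(balanced_labels))

    # Toy predictions: one true positive, one false positive,
    # one false negative and two true negatives.
    print(binary_metrics([1, 1, 0, 0, 0], [1, 0, 0, 0, 1]))
```

Random under-sampling follows the same pattern with the target set to the smallest class size and rows dropped rather than duplicated. SMOTE and ADASYN differ in that they generate new synthetic minority rows instead of copying existing ones.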
Akademia Baru Publishing

