Weighted nearest neighbors and radius oversampling for imbalanced data classification

High-dimensional and imbalanced datasets often degrade the performance of classical machine learning algorithms. In high-dimensional data, not all features are relevant or contribute significantly to model performance. This study therefore introduces a novel method, feature weighted variance analysis-nearest neighbors (WFVANN), built on k-nearest neighbors (KNN). The method modifies the Euclidean distance calculation to account for the relevance and contribution of each feature based on its F-value. The proposed model combines WFVANN at the algorithm level with the radius-synthetic minority oversampling technique (R-SMOTE) as the oversampling method at the data level to address both issues. Extensive experiments on two types of data, high-dimensional and imbalanced, compared WFVANN with state-of-the-art KNN-based and synthetic minority oversampling technique (SMOTE)-based methods. The proposed method achieved the highest accuracy, precision, recall, and F1-measure values on the majority of test datasets and outperformed the other methods.
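The abstract does not state the exact weighting formula, so the following is only a minimal sketch of the general idea of an F-value-weighted Euclidean distance inside KNN, not the authors' WFVANN implementation. It assumes that each feature's weight is its one-way ANOVA F-value normalized to sum to 1, and that the weight multiplies the squared difference for that feature. The function names (`anova_f_values`, `feature_weights`, `weighted_distance`, `predict`) and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def anova_f_values(X, y):
    """One-way ANOVA F statistic per feature: between-class over within-class variance."""
    n, d = len(X), len(X[0])
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    k = len(groups)
    eps = 1e-12  # assumption: small constant to avoid division by zero
    f_values = []
    for j in range(d):
        grand_mean = sum(row[j] for row in X) / n
        ss_between = 0.0
        ss_within = 0.0
        for rows in groups.values():
            mean_j = sum(r[j] for r in rows) / len(rows)
            ss_between += len(rows) * (mean_j - grand_mean) ** 2
            ss_within += sum((r[j] - mean_j) ** 2 for r in rows)
        ms_between = ss_between / (k - 1)
        ms_within = ss_within / (n - k)
        f_values.append(ms_between / (ms_within + eps))
    return f_values


def feature_weights(f_values):
    """Assumed weighting scheme: F-values normalized to sum to 1."""
    total = sum(f_values)
    if total == 0:
        return [1.0 / len(f_values)] * len(f_values)
    return [f / total for f in f_values]


def weighted_distance(a, b, w):
    """Euclidean distance with each squared difference scaled by its feature weight."""
    return math.sqrt(sum(wj * (aj - bj) ** 2 for wj, aj, bj in zip(w, a, b)))


def predict(X_train, y_train, x, w, k=3):
    """Majority vote among the k training points nearest to x under the weighted distance."""
    ranked = sorted(zip(X_train, y_train), key=lambda p: weighted_distance(p[0], x, w))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: feature 0 separates the classes, feature 1 is large-scale noise.
X = [[1.0, 50.0], [1.2, -40.0], [0.9, 10.0],
     [3.0, 45.0], [3.1, -35.0], [2.9, 5.0]]
y = [0, 0, 0, 1, 1, 1]

weights = feature_weights(anova_f_values(X, y))
query = [1.1, 44.0]

print(predict(X, y, query, weights, k=1))     # weighted distance: class 0
print(predict(X, y, query, [1.0, 1.0], k=1))  # plain Euclidean distance: class 1
```

In this toy example the noisy second feature dominates the plain Euclidean distance and pulls the query toward the wrong class, while the F-value weights shrink its influence to almost nothing. The oversampling step (R-SMOTE) is not sketched here because the abstract gives no detail on how the radius is used.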

Related Results

Stop Oversampling for Class Imbalance Learning: A Critical Review
For the last two decades, oversampling has been employed to overcome the challenge of learning from imbalanced datasets. Many approaches to solving this challenge ...
Advanced Re-Sampling Techniques for Multi-Class Imbalanced Classification
Imbalanced classification is a common problem in machine learning, where one class significantly outnumbers the others. This imbalance leads to biased model performance, where the ...
Optimization of Imbalanced Data in Drug-Target Interaction with Sampling and Ensemble Support Vector Machine
Imbalanced data is one of the problems that arise in prediction or classification tasks. This study focuses on addressing the imbalanced data problem ...
An Oversampling Technique with Descriptive Statistics
Oversampling is often applied as a means to win a better knowledge model. Several oversampling methods based on synthetic instances have been suggested, and SMOTE is one of the rep...
PANORAMIC REVIEW OF DISTAL RADIUS FRACTURES
Introduction: fractures affecting the distal radius are common, their incidence increases as life expectancy increases, leading to a larger population of individuals at risk of suf...
Machine Learning Modelling for Imbalanced Dataset: Case Study of Adolescent Obesity in Malaysia
Obesity among adolescents is a public health issue with an increasing burden of disease. Predicting imbalanced health data with Machine Learning may introduce bias and lead to diminish...
ANALYSIS OF THE EFFECT OF BENDING RADIUS IN THE BENDING PROCESS USING SPCC-SD PLATE ON MICROSTRUCTURAL CHANGES
This paper discusses the effect of bending radius on the quality of bending results and its effect on microstructural changes. The bending process is one of the ...
Performance evaluation of text augmentation methods with BERT on imbalanced datasets
[EMBARGOED UNTIL 6/1/2023] Recently deep learning methods have achieved great success in understanding and analyzing text messages. In real-world applications, however, labeled tex...