
Weighted nearest neighbors and radius oversampling for imbalanced data classification

High-dimensional and imbalanced datasets often degrade the performance of classical machine learning algorithms. In high-dimensional data, not all features are relevant or contribute significantly to model performance. This study therefore introduced a novel method, feature weighted variance analysis-nearest neighbors (WFVANN), built on k-nearest neighbors (KNN). The method modifies the Euclidean distance calculation so that the relevance and contribution of each feature, as measured by its F-value, are taken into account. The proposed model combines WFVANN for algorithm-level processing with the radius-synthetic minority oversampling technique (R-SMOTE) as the data-level oversampling method to address both issues. Extensive experiments were conducted on two types of data, high-dimensional and imbalanced, comparing WFVANN with state-of-the-art KNN-based and synthetic minority oversampling technique (SMOTE)-based methods. The proposed method achieved the highest accuracy, precision, recall, and F1-measure on the majority of test datasets and outperformed the other methods.
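The idea of weighting the Euclidean distance by per-feature F-values can be illustrated with a minimal sketch. The abstract does not give the exact WFVANN formula, so everything below is an assumption made for illustration: the one-way ANOVA F statistic is used as the F-value, the weights are the F-values normalized to sum to 1, and the function names (`anova_f_values`, `normalize_weights`, `weighted_distance`, `predict`) are hypothetical. The R-SMOTE oversampling step is not shown.

```python
import math
from collections import Counter, defaultdict


def anova_f_values(X, y):
    """One-way ANOVA F statistic per feature.

    Assumes at least two classes and more samples than classes.
    """
    n, d = len(X), len(X[0])
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    k = len(groups)
    f_values = []
    for j in range(d):
        grand_mean = sum(row[j] for row in X) / n
        ss_between = 0.0
        ss_within = 0.0
        for rows in groups.values():
            col = [r[j] for r in rows]
            mean = sum(col) / len(col)
            ss_between += len(col) * (mean - grand_mean) ** 2
            ss_within += sum((v - mean) ** 2 for v in col)
        ms_between = ss_between / (k - 1)
        ms_within = ss_within / (n - k)
        # Sketch-level guard: a feature with zero within-class variance
        # gets F = 0 here instead of an infinite value.
        f_values.append(ms_between / ms_within if ms_within > 0 else 0.0)
    return f_values


def normalize_weights(f_values):
    """Scale F-values so they sum to 1 (an assumed weighting scheme)."""
    total = sum(f_values)
    if total == 0:
        return [1.0 / len(f_values)] * len(f_values)
    return [f / total for f in f_values]


def weighted_distance(a, b, weights):
    """Euclidean distance with each squared difference scaled by a weight."""
    return math.sqrt(
        sum(w * (ai - bi) ** 2 for w, ai, bi in zip(weights, a, b))
    )


def predict(X_train, y_train, query, weights, k=3):
    """Majority vote among the k training points nearest to the query."""
    ranked = sorted(
        zip(X_train, y_train),
        key=lambda pair: weighted_distance(pair[0], query, weights),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: feature 0 separates the classes, feature 1 is mostly noise.
X_train = [[1.0, 5.0], [1.2, 1.0], [0.8, 9.0],
           [3.0, 5.5], [3.2, 0.5], [2.8, 8.5]]
y_train = [0, 0, 0, 1, 1, 1]

weights = normalize_weights(anova_f_values(X_train, y_train))
print(weights)
print(predict(X_train, y_train, [1.1, 8.6], weights))
```

With this toy data the informative feature receives almost all of the weight, so the noisy feature has little influence on which neighbors are selected. Other weighting choices, such as a different normalization of the F-values, would fit the same structure.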

Institute of Advanced Engineering and Science

Gede Angga Pradipta, Putu Desiana Wulaning Ayu, Made Liandana, Dandy Pramana Hostiadi

IAES International Journal of Artificial Intelligence (IJ-AI)

2024


For the last two decades, oversampling has been employed to overcome the challenge of learning from imbalanced datasets. Many approaches to solving this challenge ...

Advanced Re-Sampling Techniques for Multi-Class Imbalanced Classification

Imbalanced classification is a common problem in machine learning, where one class significantly outnumbers the others. This imbalance leads to biased model performance, where the ...

Optimasi Data Tidak Seimbang pada Interaksi Drug Target dengan Sampling dan Ensemble Support Vector Machine

Imbalanced data is one of the problems that arise in prediction or classification tasks. This study focuses on addressing the imbalanced data problem ...

An Oversampling Technique with Descriptive Statistics

Oversampling is often applied as a means to win a better knowledge model. Several oversampling methods based on synthetic instances have been suggested, and SMOTE is one of the rep...

PANORAMIC REVIEW OF DISTAL RADIUS FRACTURES

Introduction: fractures affecting the distal radius are common, their incidence increases as life expectancy increases, leading to a larger population of individuals at risk of suf...

Machine Learning Modelling for Imbalanced Dataset: Case Study of Adolescent Obesity in Malaysia

Obesity among adolescent is a public health issue with increasing burden of disease. Predicting imbalanced health data with Machine Learning may introduce bias and lead to diminish...

ANALISIS PENGARUH RADIUS BENDING PADA PROSES BENDING MENGGUNAKAN PELAT SPCC-SD TERHADAP PERUBAHAN STRUKTUR MIKRO

This paper discusses the effect of bending radius on the quality of the bending result and its effect on changes in microstructure. The bending process is one of the ...

Performance evaluation of text augmentation methods with BERT on imbalanced datasets

[EMBARGOED UNTIL 6/1/2023] Recently deep learning methods have achieved great success in understanding and analyzing text messages. In real-world applications, however, labeled tex...


Related Results