
An automated approach for binary classification on imbalanced data

Abstract: Imbalanced data is present in various business areas and must be handled with appropriate resampling techniques and classification algorithms. However, there is a multitude of possible combinations of resampling and learning methods for imbalanced data, and using them correctly requires specialised knowledge. In this paper, several approaches to handling imbalanced data are considered, ranging from more accessible to more advanced, in the domains of data resampling and cost-sensitive techniques. The application developed recommends the most suitable combinations of techniques for a specific dataset by extracting its meta-feature values and comparing them with those recorded in a knowledge base. It simplifies classification and automates part of the machine learning pipeline, with results comparable to or better than those of a state-of-the-art solution and with a much shorter execution time.
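The recommendation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the meta-features chosen here, the knowledge-base entries, the technique names stored in it, and the nearest-neighbour lookup are all assumptions made for illustration only.

```python
import math
from collections import Counter


def meta_features(X, y):
    """Compute a few simple dataset meta-features (an illustrative choice)."""
    counts = Counter(y)
    majority = max(counts.values())
    minority = min(counts.values())
    return {
        "log_n_samples": math.log10(len(X)),
        "n_features": float(len(X[0])),
        "imbalance_ratio": majority / minority,
    }


def recommend(features, knowledge_base, k=1):
    """Return the technique combinations of the k most similar known datasets."""
    def distance(entry):
        # Plain Euclidean distance on raw values; a real system would
        # probably normalise each meta-feature first.
        return math.sqrt(
            sum((features[name] - entry["meta"][name]) ** 2 for name in features)
        )

    ranked = sorted(knowledge_base, key=distance)
    return [entry["best"] for entry in ranked[:k]]


# Hypothetical knowledge base: meta-features of previously seen datasets and
# the (resampling, classifier) pair assumed to have worked best on each.
KNOWLEDGE_BASE = [
    {"meta": {"log_n_samples": 2.0, "n_features": 2.0, "imbalance_ratio": 9.0},
     "best": ("SMOTE", "random forest")},
    {"meta": {"log_n_samples": 4.0, "n_features": 20.0, "imbalance_ratio": 1.5},
     "best": ("no resampling", "logistic regression")},
    {"meta": {"log_n_samples": 3.0, "n_features": 10.0, "imbalance_ratio": 50.0},
     "best": ("class weighting", "gradient boosting")},
]

# Toy binary dataset: 100 rows, 2 features, 10 positives and 90 negatives.
X = [[float(i), float(i % 3)] for i in range(100)]
y = [1 if i < 10 else 0 for i in range(100)]

features = meta_features(X, y)
recommendation = recommend(features, KNOWLEDGE_BASE)
print(features)
print(recommendation)
```

A lookup of this kind is cheap because it replaces a search over many resampling and classifier combinations with a single comparison against stored records, which is consistent with the shorter execution time the abstract reports.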
Research Square Platform LLC

Related Results

Improving Medical Document Classification via Feature Engineering
Document classification (DC) is the task of assigning the predefined labels to unseen documents by utilizing the model trained on the available labeled documents...
Handling the Imbalanced Problem in Agri-Food Data Analysis
Imbalanced data situations exist in most fields of endeavor. The problem has been identified as a major bottleneck in machine learning/data mining and is becoming a serious issue o...
Application of Machine Learning Techniques for Customer Churn Prediction in the Banking Sector
Aim/Purpose: Previous studies have primarily focused on comparing predictive models without considering the impact of data preprocessing on model performance. Therefore, this study...
Competitive Indices in Cereal and Legume Mixtures in a South Asian Environment
Core Ideas Cereal‐legume binary mixtures increased forage productivity per unit area compared to cereal‐cereal and legume‐legume binary mixtures. In binary mixtures, pearl millet w...
Weak tagging and imbalanced networks for online review sentiment classification
Sentiment classification aims to complete the automatic judgment task of text sentiment tendency. In the sentiment classification task of online reviews, traditional deep learning ...
Synthesis, characterization and application of novel ionic liquids
Ionic liquids (ILs), or molten salts at room temperature, presently receive significant attention in many areas of chemistry. The most attractive property is the "tunability" of t...
Enhancing Alzheimer’s disease classification through split federated learning and GANs for imbalanced datasets
In the rapidly evolving healthcare sector, using advanced technologies to improve medical classification systems has become crucial for enhancing patient care, diagnosis, and treat...
Optimising tool wear and workpiece condition monitoring via cyber-physical systems for smart manufacturing
Smart manufacturing has been developed since the introduction of Industry 4.0. It consists of resource sharing and networking, predictive engineering, and material and data analyti...