Comparative Analysis of Resampling Techniques for Class Imbalance in Financial Distress Prediction Using XGBoost
Class imbalance is a key challenge in financial distress prediction: distressed samples are far outnumbered by non-distressed ones. This study compares eight resampling techniques for improving distress prediction with the XGBoost algorithm. The dataset, drawn from the CSMAR database, contains 26,383 firm-quarter samples from 639 Chinese A-share listed companies (2007–2024), of which only 12.1% are distressed. Results show that standard Synthetic Minority Oversampling Technique (SMOTE) improved the F1-score (up to 0.73) and Matthews Correlation Coefficient (MCC, up to 0.70), while SMOTE-Tomek and Borderline-SMOTE further boosted recall at a slight cost in precision. These oversampling and hybrid methods also maintained reasonable computational efficiency. Random Undersampling (RUS) was the fastest method and yielded high recall (0.85), but suffered from low precision (0.46) and weaker generalization. Among all techniques, Bagging-SMOTE achieved balanced performance (AUC 0.96, F1 0.72, PR-AUC 0.80, MCC 0.68) using a minority-to-majority ratio of 0.15, demonstrating that ensemble-based resampling can improve robustness with minimal impact on the original class distribution, albeit at higher computational cost. The comparison shows that no single approach fits all use cases, and technique selection should align with specific goals. Techniques favoring recall (e.g., Bagging-SMOTE, SMOTE-Tomek) are suited for early warning, conservative techniques (e.g., Tomek Links) help reduce false positives in risk-sensitive applications, and efficient methods such as RUS are preferable when computational speed is a priority.
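As a rough check on the figures above, the short sketch below works through two pieces of arithmetic: the F1-score implied by the reported RUS precision and recall, and the minority-to-majority ratio implied by a 12.1% distressed share. The helper names are hypothetical and not from the study. The sample counts are an illustrative assumption only, since they apply the 12.1% share to all 26,383 samples, whereas the study presumably resampled a training split whose size is not given here.

```python
import math


def f1_from_precision_recall(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def minority_to_majority_ratio(minority_share: float) -> float:
    """Convert a minority share of all samples into a minority:majority ratio."""
    return minority_share / (1.0 - minority_share)


def synthetic_needed(n_minority: int, n_majority: int, target_ratio: float) -> int:
    """Minority samples to add so that minority / majority reaches target_ratio."""
    return max(0, math.ceil(target_ratio * n_majority - n_minority))


# Figures reported in the abstract.
rus_f1 = f1_from_precision_recall(precision=0.46, recall=0.85)
base_ratio = minority_to_majority_ratio(0.121)

# Illustrative assumption: treat the whole dataset as the set being resampled.
n_total = 26_383
n_minority = round(n_total * 0.121)
n_majority = n_total - n_minority
extra = synthetic_needed(n_minority, n_majority, target_ratio=0.15)

print(f"RUS F1 implied by P=0.46, R=0.85: {rus_f1:.2f}")
print(f"Original minority-to-majority ratio: {base_ratio:.3f}")
print(f"Synthetic samples needed to reach 0.15: {extra}")
```

Under these assumptions the RUS figures imply an F1 of about 0.60, below the 0.72 to 0.73 reported for the SMOTE variants. The original ratio of roughly 0.138 is also already close to the 0.15 target, which is consistent with the abstract's remark that Bagging-SMOTE had minimal impact on the original class distribution.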

