
An Investigation of Bias in Bangla Text Classification Models

Abstract: The rapid growth of natural language processing (NLP) applications has highlighted concerns about fairness and bias in text classification models. Despite significant advancements, the evaluation of bias and fairness in Bangla text classification remains underexplored. This study investigates model bias in Bangla text classification models, focusing on key fairness metrics such as Demographic Parity, Equalized Odds, and Accuracy Parity. We analyze the performance of widely used models, including Naive Bayes (NB), Support Vector Machine (SVM), Random Forest (RF), LSTM, and Bangla-BERT, on a comprehensive dataset. The results reveal disparities in fairness across models, with Bangla-BERT achieving the highest fairness scores but still exhibiting measurable bias. To address this, we conduct an error analysis, highlighting the prevalence of bias-induced misclassifications across sensitive attributes. Additionally, we propose actionable recommendations to enhance fairness in Bangla NLP models, bridging gaps in ethical AI for low-resource languages. Our findings provide valuable insights for developing more equitable Bangla text classification systems and emphasize the need for fairness-aware methodologies in future NLP research.
Springer Science and Business Media LLC
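
The three fairness metrics named in the abstract can be illustrated with a small sketch. This is not the authors' evaluation code: it assumes binary labels, a single sensitive attribute, and the common "largest gap between groups" reading of each metric. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Per-group positive rate, TPR, FPR, and accuracy for binary 0/1 labels.

    Assumes every group contains at least one positive and one negative
    example; otherwise TPR or FPR would be undefined for that group.
    """
    counts = defaultdict(lambda: {"n": 0, "pred_pos": 0, "correct": 0,
                                  "pos": 0, "tp": 0, "neg": 0, "fp": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        c = counts[g]
        c["n"] += 1
        c["pred_pos"] += p
        c["correct"] += int(t == p)
        if t == 1:
            c["pos"] += 1
            c["tp"] += p
        else:
            c["neg"] += 1
            c["fp"] += p
    return {
        g: {
            "positive_rate": c["pred_pos"] / c["n"],
            "tpr": c["tp"] / c["pos"],
            "fpr": c["fp"] / c["neg"],
            "accuracy": c["correct"] / c["n"],
        }
        for g, c in counts.items()
    }


def gap(rates, key):
    """Largest difference in one rate between any two groups."""
    values = [r[key] for r in rates.values()]
    return max(values) - min(values)


def fairness_gaps(y_true, y_pred, groups):
    """Gap-style summaries; 0.0 means the groups are treated identically."""
    rates = group_rates(y_true, y_pred, groups)
    return {
        # Demographic Parity: equal rate of positive predictions per group.
        "demographic_parity_gap": gap(rates, "positive_rate"),
        # Equalized Odds: equal TPR and equal FPR per group (worse of the two).
        "equalized_odds_gap": max(gap(rates, "tpr"), gap(rates, "fpr")),
        # Accuracy Parity: equal accuracy per group.
        "accuracy_parity_gap": gap(rates, "accuracy"),
    }


# Hypothetical toy data: two groups ("a", "b") of four examples each.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

gaps = fairness_gaps(y_true, y_pred, groups)
print(gaps)
```

On this toy data both groups have the same accuracy, yet the demographic parity and equalized odds gaps are non-zero, which shows why the metrics are reported separately rather than collapsed into one score.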

Related Results

E-Press and Oppress
From elephants to ABBA fans, silicon to hormone, the following discussion uses a new research method to look at printed text, motion pictures and a te...
Exploring the topical structure of short text through probability models : from tasks to fundamentals
Recent technological advances have radically changed the way we communicate. Today’s communication has become ubiquitous and it has fostered the need for information that is easie...
On Flores Island, do "ape-men" still exist? https://www.sapiens.org/biology/flores-island-ape-men/
Evaluating the Science to Inform the Physical Activity Guidelines for Americans Midcourse Report
Abstract The Physical Activity Guidelines for Americans (Guidelines) advises older adults to be as active as possible. Yet, despite the well documented benefits of physical a...
Tropical Indian Ocean Mixed Layer Bias in CMIP6 CGCMs Primarily Attributed to the AGCM Surface Wind Bias
The relatively weak sea surface temperature bias in the tropical Indian Ocean (TIO) simulated in the coupled general circulation model (CGCM) from the recently released CMIP6 has be...
Optimising tool wear and workpiece condition monitoring via cyber-physical systems for smart manufacturing
Smart manufacturing has been developed since the introduction of Industry 4.0. It consists of resource sharing and networking, predictive engineering, and material and data analyti...
Λc Physics at BESIII
In 2014 BESIII collected a data sample of 567 [Formula: see text] at [Formula: see text] = 4.6 GeV, which is just above the [Formula: see text] pair production threshold. By analyz...
Enhancing Depressive Post Detection in Bangla: A Comparative Study of TF-IDF, BERT and FastText Embeddings
Due to massive adoption of social media, detection of users’ depression through social media analytics bears significant importance, particularly for underrepresented languages, su...
