An Investigation of Bias in Bangla Text Classification Models
Abstract
The rapid growth of natural language processing (NLP) applications has highlighted concerns about fairness and bias in text classification models. Despite significant advances in the field, the evaluation of bias and fairness in Bangla text classification remains underexplored. This study investigates bias in Bangla text classification models, focusing on three fairness metrics: Demographic Parity, Equalized Odds, and Accuracy Parity. We analyze the performance of widely used models, including Naive Bayes (NB), Support Vector Machine (SVM), Random Forest (RF), LSTM, and Bangla-BERT, on a comprehensive dataset. The results reveal disparities in fairness across models, with Bangla-BERT achieving the highest fairness scores but still exhibiting measurable bias. To examine this bias, we conduct an error analysis, highlighting the prevalence of bias-induced misclassifications across sensitive attributes. Additionally, we propose actionable recommendations to enhance fairness in Bangla NLP models, bridging gaps in ethical AI for low-resource languages. Our findings provide insights for developing more equitable Bangla text classification systems and emphasize the need for fairness-aware methodologies in future NLP research.
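The abstract names three fairness metrics without defining them. The sketch below shows one common way to compute them for a binary classifier, using the largest between-group gap in each rate. It is an illustration under assumptions, not the study's own formulation: the function names (`group_rates`, `max_gap`, `fairness_gaps`), the gap-based aggregation, and the toy data are all hypothetical.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Per-group positive rate, TPR, FPR, and accuracy for binary 0/1 labels."""
    counts = defaultdict(lambda: {"n": 0, "pred_pos": 0, "correct": 0,
                                  "pos": 0, "tp": 0, "neg": 0, "fp": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        c = counts[g]
        c["n"] += 1
        c["pred_pos"] += p
        c["correct"] += int(t == p)
        if t == 1:
            c["pos"] += 1
            c["tp"] += p
        else:
            c["neg"] += 1
            c["fp"] += p
    rates = {}
    for g, c in counts.items():
        rates[g] = {
            "positive_rate": c["pred_pos"] / c["n"],
            # A group with no positives (or no negatives) has an undefined rate.
            "tpr": c["tp"] / c["pos"] if c["pos"] else float("nan"),
            "fpr": c["fp"] / c["neg"] if c["neg"] else float("nan"),
            "accuracy": c["correct"] / c["n"],
        }
    return rates


def max_gap(rates, key):
    """Largest difference in one rate between any two groups."""
    values = [r[key] for r in rates.values()]
    return max(values) - min(values)


def fairness_gaps(y_true, y_pred, groups):
    """Gap-style summaries; 0.0 means the groups are treated identically."""
    rates = group_rates(y_true, y_pred, groups)
    return {
        # Demographic Parity: equal rate of positive predictions per group.
        "demographic_parity_gap": max_gap(rates, "positive_rate"),
        # Equalized Odds: equal TPR and equal FPR per group (worst of the two).
        "equalized_odds_gap": max(max_gap(rates, "tpr"), max_gap(rates, "fpr")),
        # Accuracy Parity: equal accuracy per group.
        "accuracy_parity_gap": max_gap(rates, "accuracy"),
    }


# Toy data (hypothetical, not from the study): two groups of four examples each.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gaps = fairness_gaps(y_true, y_pred, groups)
```

On this toy data both groups have the same accuracy, yet the demographic parity and equalized odds gaps are non-zero, which illustrates why the three metrics are reported separately. The sketch assumes every group contains both classes; otherwise the undefined rates need explicit handling.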

