Enhancing Depressive Post Detection in Bangla: A Comparative Study of TF-IDF, BERT and FastText Embeddings

Given the massive adoption of social media, detecting users' depression through social media analytics is of significant importance, particularly for underrepresented languages such as Bangla. This study introduces a well-grounded approach to identifying depressive social media posts in Bangla by employing advanced natural language processing techniques. The dataset used in this work, annotated by domain experts, includes both depressive and non-depressive posts, ensuring high-quality data for model training and evaluation. To address the prevalent issue of class imbalance, we utilised random oversampling for the minority class, thereby enhancing the model's ability to accurately detect depressive posts. We explored various numerical representation techniques, including Term Frequency-Inverse Document Frequency (TF-IDF), Bidirectional Encoder Representations from Transformers (BERT) embedding and FastText embedding, by integrating them with a deep learning-based Convolutional Neural Network-Bidirectional Long Short-Term Memory (CNN-BiLSTM) model. The results, obtained through extensive experimentation, indicate that the BERT approach performed better than the others, achieving an F1-score of 84%. This suggests that BERT, in combination with the CNN-BiLSTM architecture, effectively recognises the nuances of Bangla texts relevant to depressive content. Comparative analysis with existing state-of-the-art methods demonstrates that our approach with BERT embedding performs better than others in terms of evaluation metrics and the reliability of dataset annotations. Our research contributes significantly to the development of reliable tools for detecting depressive posts in the Bangla language. By highlighting the efficacy of different embedding techniques and deep learning models, this study paves the way for improved mental health monitoring through social media platforms.
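
The random oversampling step mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `random_oversample`, the toy posts, and the label encoding (1 for depressive, 0 for non-depressive) are all hypothetical, chosen only to show the general idea of duplicating randomly chosen minority-class samples until the classes are the same size.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Balance classes by re-drawing minority samples with replacement.

    Hypothetical helper for illustration; not taken from the study.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class

    # Start from the original data and append the extra draws.
    out_texts, out_labels = list(texts), list(labels)
    for cls, n in counts.items():
        pool = [t for t, y in zip(texts, labels) if y == cls]
        extra = rng.choices(pool, k=target - n)  # empty for the majority class
        out_texts.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_texts, out_labels


# Toy data (assumed): four non-depressive posts, one depressive post.
texts = ["post a", "post b", "post c", "post d", "post e"]
labels = [0, 0, 0, 0, 1]

bal_texts, bal_labels = random_oversample(texts, labels)
print(Counter(bal_labels))
```

In a sketch like this, oversampling would normally be applied to the training split only, so that duplicated posts do not also appear in the evaluation data.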

Related Results

Primerjalna književnost na prelomu tisočletja
In a comprehensive and at times critical manner, this volume seeks to shed light on the development of events in Western (i.e., European and North American) comparative literature ...
Learned Text Representation for Amharic Information Retrieval and Natural Language Processing
Over the past few years, word embeddings and bidirectional encoder representations from transformers (BERT) models have brought better solutions to learning text representations fo...
An Investigation of Bias in Bangla Text Classification Models
Abstract The rapid growth of natural language processing (NLP) applications has highlighted concerns about fairness and bias in text classification models. Despite signific...
Enhancing Customer Satisfaction Analysis Using Advanced Machine Learning Techniques in Fintech Industry
Customer satisfaction (CSAT) is vital in service and marketing, indicating how well products or services meet customer expectations. Traditional CSAT methods like the American Cust...
Longitudinal association between depressive symptoms and self-directed passive aggression: A random intercept cross-lagged panel analysis
Abstract Background: Self-directed passive aggression (SD-PAB) is defined as any behaviour harming one-self by inactivity and omission of own needs. Depressive disorders are a severe ...
Evaluasi Pengukuran Semantik Sinonim KBBI Menggunakan Pendekatan Word Embedding
The Kamus Besar Bahasa Indonesia (KBBI) is one of the main data sources in research on determining the similarity of word meanings in Indonesian. This study discusses how...
A Pre-Training Technique to Localize Medical BERT and to Enhance Biomedical BERT
Abstract Background: Pre-training large-scale neural language models on raw texts has been shown to make a significant contribution to a strategy for transfer learning in n...
