Roman Urdu Hate Speech Detection Using Transformer-Based Model for Cyber Security Applications

Social media applications such as Twitter and Facebook allow users around the globe to communicate and share their thoughts, status updates, opinions, photographs, and videos. Unfortunately, some people use these platforms to disseminate hate speech and abusive language. The growth of hate speech may result in hate crimes, cyber violence, and substantial harm to cyberspace, physical security, and social safety. Hate speech detection is therefore a critical issue for both cyberspace and physical society, and it calls for a robust application capable of detecting and combating hate speech in real time. Because hate speech detection is a context-dependent problem, it requires context-aware mechanisms. In this study, we employed a transformer-based model for Roman Urdu hate speech classification because of its ability to capture textual context. In addition, we developed the first Roman Urdu pre-trained BERT model, which we named BERT-RU, by training BERT from scratch on the largest Roman Urdu dataset, consisting of 173,714 text messages. Traditional and deep learning models were used as baselines, including LSTM, BiLSTM, BiLSTM + Attention Layer, and CNN. We also investigated transfer learning by using pre-trained BERT embeddings in conjunction with deep learning models. Each model was evaluated in terms of accuracy, precision, recall, and F-measure, and its generalization was evaluated on a cross-domain dataset. The experimental results revealed that the transformer-based model, when applied directly to Roman Urdu hate speech classification, outperformed traditional machine learning models, deep learning models, and pre-trained transformer-based models, with accuracy, precision, recall, and F-measure scores of 96.70%, 97.25%, 96.74%, and 97.89%, respectively. The transformer-based model also exhibited superior generalization on the cross-domain dataset.
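
The four evaluation measures named in the abstract can be illustrated with a minimal sketch. The abstract does not state the averaging scheme or which class is treated as positive, so the sketch assumes a binary task in which label 1 means "hate speech" and the F-measure is the balanced F1 score. The function name `binary_metrics` and the toy labels are hypothetical and are not taken from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary task.

    Assumption: `positive` marks the hate-speech class; this is an
    illustrative helper, not the evaluation code used in the study.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for illustration only (1 = hate speech, 0 = neutral).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

With more than two classes, or with macro or weighted averaging, the per-class values would be computed in the same way and then combined; the abstract does not say which variant was used.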

MDPI AG

Muhammad Bilal, Atif Khan, Salman Jan, Shahrulniza Musa, Shaukat Ali

Sensors

2023
