
Ensemble Method for Indonesian Twitter Hate Speech Detection

Due to the massive increase in user-generated web content, particularly on social media networks where anyone can post freely without restriction, the amount of hateful activity is also increasing. Social media and microblogging services such as Twitter allow user tweets to be read and analyzed in near real time. Twitter is a logical source of data for hate speech analysis, since Twitter users are more likely to express their emotions about an event by posting tweets. This analysis can help identify hate speech early, so that it can be prevented from spreading widely. Manually classifying hateful content on Twitter is costly and not scalable. Therefore, automatic hate speech detection needs to be developed for tweets in the Indonesian language. In this study, we used ensemble methods for hate speech detection in Indonesian. We employed five stand-alone classification algorithms (Naïve Bayes, K-Nearest Neighbours, Maximum Entropy, Random Forest, and Support Vector Machines) and two ensemble methods (hard voting and soft voting) on a Twitter hate speech dataset. The experimental results showed that using an ensemble method can improve classification performance. The best result was achieved with soft voting, with an F1 measure of 79.8% on the unbalanced dataset and 84.7% on the balanced dataset. Although the improvement is modest, using an ensemble method can reduce the risk of choosing a poor classifier for deciding whether new tweets are hate speech.
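The difference between the two ensemble methods named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names hard_vote and soft_vote, the label strings, and the probability values are all hypothetical, chosen only to show a case where the two rules disagree. Hard voting takes the majority of the predicted labels, while soft voting averages the predicted class probabilities and picks the class with the highest mean.

```python
from collections import Counter


def hard_vote(predicted_labels):
    """Majority vote over one predicted label per classifier.

    Ties are resolved in favor of the label that appears first in the input.
    """
    return Counter(predicted_labels).most_common(1)[0][0]


def soft_vote(predicted_probas):
    """Average the per-class probabilities and return the top class.

    Each element is a dict mapping class label to probability, one dict
    per classifier, all sharing the same set of labels.
    """
    classes = list(predicted_probas[0])
    n = len(predicted_probas)
    mean = {c: sum(p[c] for p in predicted_probas) / n for c in classes}
    return max(mean, key=mean.get)


# Hypothetical outputs of five classifiers for a single tweet.
# The numbers are made up for illustration and are not from the study.
probas = [
    {"hate": 0.45, "not_hate": 0.55},
    {"hate": 0.40, "not_hate": 0.60},
    {"hate": 0.48, "not_hate": 0.52},
    {"hate": 0.95, "not_hate": 0.05},
    {"hate": 0.90, "not_hate": 0.10},
]

# Each classifier's hard label is its most probable class.
labels = [max(p, key=p.get) for p in probas]

hard_result = hard_vote(labels)   # three of five classifiers say "not_hate"
soft_result = soft_vote(probas)   # mean probability of "hate" is 0.636
print(hard_result, soft_result)
```

In this made-up case, three weakly confident classifiers outvote two strongly confident ones under hard voting, whereas soft voting lets the confident predictions dominate. Soft voting assumes every base classifier can output class probabilities, which some models only provide after an additional calibration step.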

Related Results

Vihapuheen kohteet ja teemat sekä lajit ja muodot ennen ja nyt [Targets and themes of hate speech and its types and forms, then and now]
This article analyzes the nature of hate speech and its forms of discourse in the 1930s and the 2000s. The aim has been to find the similarities and differences that, in two different...
Bilingual Hate Speech Detection on Social Media : Amharic and Afaan Oromo
Due to significant increases in internet penetration and the development of smartphone technology during the preceding couple of decades, many people have started ...
From Hate Crime to Disability Hate Crime
This chapter traces the journey from hate crime to Disability Hate Crime through an analysis of the relevant literature including policy related documents which construct and refer...
Kajian Kriminologi Tindakan Hate Speech Akun Fufufafa dan Penerapan Hukum Pidana [A Criminological Study of Hate Speech Acts by the Fufufafa Account and the Application of Criminal Law]
The advancement of information and communication technology has given rise to the cyber era, transforming the way society interacts, including how individuals express the...
Modeling and Analysis of Hate speech Propagation in a Community using Fractional Order Derivatives
The propagation of hate speech directed toward local public sector administrations in a community has become an issue of great concern. Hate speech not only underm...
Countering hate speech: modeling user-generated web content using natural language processing
Social media is considered a particularly conducive arena for hate speech. Counter speech, which is a "direct response that counters hate speech" is a remedy to address hate speech...
Hate Speech and Abusive Language and Abusive Language Detection in Twitter using Machine Learning
Twitter's central goal is to enable everybody to make and share thoughts and data, and to communicate their suppositions and convictions without boundaries. Twitter's job is to ser...
Twitter Hate Aspect Extraction Using Association Analysis and Dictionary-Based Approach
Recent research regarding hate speech is in the domain of social sciences and psychology. From these trends, the dissemination of hate speech and antagonistic content in social med...
