
Extremism Detection in the Iraqi Dialect Based on Machine Learning

Extremism detection is an important area of natural language processing (NLP), used to detect hate speech, sectarianism, and terrorism on social media. The field has been studied for many languages, especially Arabic and English, but most studies address the standard language and neglect dialects, even though social media users write in their own dialects. The Iraqi dialect is one of the most difficult Arabic dialects, and few data resources for it are available online to researchers. This research therefore aims to detect extremism in Iraqi texts using machine learning. The data were pre-processed by removing prefixes and suffixes from Iraqi words, removing repeated letters within words, and removing Iraqi stop words. In the embedding step, words were represented using pre-trained embeddings as well as embeddings built with Gensim Word2vec and FastText. Four classifiers were used: Support Vector Machine (SVM), Logistic Regression (LR), K-Nearest Neighbor (KNN), and Gaussian Naive Bayes (GNB). The experiments were conducted on two extremism-related Iraqi datasets collected from social media platforms: the Iraqi Facebook Comments Dataset (IFCD) and the Iraqi Tweets Dataset (ITD). All models were evaluated using accuracy, macro-average precision, macro-average recall, and macro-average F1-score; the best F1-score was 0.9521, with recall and precision of 0.95 and 0.955, respectively. In addition, the models were tested on a publicly available Iraqi hate-speech dataset, and the results were compared with those reported in the work that introduced that dataset.
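
The pre-processing steps and macro-averaged metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The stop-word and affix lists are tiny hypothetical placeholders, the rule "collapse runs of three or more identical letters to one" is an assumption about what deleting repeated letters means, and the order of the steps is also assumed. Macro averaging is taken here as the unweighted mean of per-class scores.

```python
import re

# Hypothetical placeholder resources: the paper's real Iraqi stop-word and
# affix lists are not given in the abstract.
STOP_WORDS = {"على"}
PREFIXES = ("ال",)
SUFFIXES = ("ات",)
MIN_STEM_LEN = 3  # assumed guard so short words are not stripped to nothing


def collapse_repeats(token):
    """Collapse a run of 3+ identical characters to one (assumed rule)."""
    return re.sub(r"(.)\1{2,}", r"\1", token)


def strip_affixes(token):
    """Remove at most one listed prefix and one listed suffix."""
    for prefix in PREFIXES:
        if token.startswith(prefix) and len(token) - len(prefix) >= MIN_STEM_LEN:
            token = token[len(prefix):]
            break
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= MIN_STEM_LEN:
            token = token[:-len(suffix)]
            break
    return token


def preprocess(text):
    """Whitespace-tokenize, collapse repeats, drop stop words, strip affixes."""
    tokens = [collapse_repeats(tok) for tok in text.split()]
    return [strip_affixes(tok) for tok in tokens if tok not in STOP_WORDS]


def macro_scores(y_true, y_pred):
    """Return (precision, recall, F1), each the unweighted mean over classes."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


if __name__ == "__main__":
    print(preprocess("الكتاب حلووووو على"))
    print(macro_scores([1, 1, 0, 0], [1, 0, 0, 0]))
```

Because the macro F1-score here is the mean of per-class F1 values, it generally differs slightly from the harmonic mean of macro precision and macro recall.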

