High Dimensional Computing on Arabic Language Classification
Abstract
The brain's circuits are enormous in terms of numbers of neurons and synapses, suggesting that very large circuits are fundamental to the brain's computing. Hyperdimensional computing rests on the observation that brains compute with patterns of neural activity that are not readily associated with numbers. In fact, the brain's ability to calculate with numbers is poor; instead, because of the very large circuits, neural computation can be modeled with points of a high-dimensional space, that is, with hypervectors. When the dimensionality D is in the thousands (for example, D = 10,000), the representation is called hyperdimensional. Hypervectors are random and holographic, with independent and identically distributed (i.i.d.) components. A hypervector holds its information superposed and spread across all of its components in a fully holistic representation, so no component is more responsible for storing any piece of information than another. Hypervectors are combined with operations analogous to addition, multiplication, and permutation that give the vector space an algebraic structure, and they are compared for similarity using a distance metric over that space. These operations on hypervectors can be composed into interesting computational behavior with novel properties that make them robust and efficient.

This paper focuses on the use of hyperdimensional computing for identifying the language of text samples, based on encoding sequences of letters into hypervectors. Recognizing the language of a given text is the first step in all kinds of language processing, such as text analysis, classification, translation, and so on. High-dimensional vector models are popular in natural language processing, where they are used to capture word meaning from word-usage statistics.

The first task in this paper is hyperdimensional-computing-based classification of Arabic text on three datasets: arabiya, khaleej, and akhbarona. Hyperdimensional computing with letter N-gram encoding is applied to each. On the SANAD (Single-label Arabic News Articles Dataset) with 12-gram encoding, the accuracy is 0.9665; on the RTA dataset with 6-gram encoding, 0.6648; and on the ANT dataset with 12-gram encoding, 0.9248. The second task applies hyperdimensional computing to Arabic dialect recognition for Levantine dialects, using three datasets. The first is the SDC (Shami Dialects Corpus), covering Jordanian, Lebanese, Palestinian, and Syrian dialects, where 7-gram encoding gives an accuracy of 0.8234. The second is PADIC (Parallel Arabic DIalect Corpus), containing Syrian and Palestinian Arabic dialects, where 5-gram encoding gives an accuracy of 0.7458. The third is MADAR (Multi-Arabic Dialect Applications and Resources), where 6-gram encoding gives an accuracy of 0.7800.
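The abstract does not spell out the encoding pipeline, so the following is a minimal illustrative sketch (not the paper's implementation) of letter-N-gram hyperdimensional classification in the spirit described above. It assumes bipolar (+1/-1) hypervectors, permutation (np.roll) to mark letter position, element-wise multiplication to bind the letters of an N-gram, summation to bundle N-grams into a prototype vector, and cosine similarity for classification; the dimensionality, N, and training strings are placeholders.

```python
import numpy as np

D = 10_000          # hypervector dimensionality (the "hyperdimensional" regime)
N = 3               # N-gram size (the paper sweeps values such as 5, 6, 7, 12)
rng = np.random.default_rng(0)

def letter_hv(letter, memory={}):
    """Random bipolar hypervector per letter: i.i.d. components, cached."""
    if letter not in memory:
        memory[letter] = rng.choice([-1, 1], size=D)
    return memory[letter]

def ngram_hv(ngram):
    """Bind the letters of an N-gram: np.roll encodes position,
    element-wise multiplication binds the permuted letter vectors."""
    hv = np.ones(D, dtype=int)
    for i, letter in enumerate(ngram):
        hv *= np.roll(letter_hv(letter), i)
    return hv

def text_hv(text):
    """Bundle (sum) the hypervectors of all N-grams in the text."""
    total = np.zeros(D)
    for i in range(len(text) - N + 1):
        total += ngram_hv(text[i:i + N])
    return total

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

# Train: one prototype (class) hypervector per language from labelled text.
train = {"english": "the quick brown fox jumps over the lazy dog",
         "arabic_translit": "althaelab albuniyu alsarie yaqfiz fawq alkalb alkasul"}
prototypes = {lang: text_hv(sample) for lang, sample in train.items()}

# Classify: a query is assigned to the most similar prototype.
query_hv = text_hv("a lazy dog sleeps over there")
pred = max(prototypes, key=lambda lang: cosine(query_hv, prototypes[lang]))
print(pred)   # expected: "english"
```

In this sketch the class (language or dialect) prototypes are simply bundled hypervectors of the training text, and a query is assigned to the prototype with the highest cosine similarity, which reflects the simplicity and robustness the abstract attributes to hyperdimensional computing.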
Research Square Platform LLC


