Standardizing Darija: Collaborative approaches in the Moroccan Darija Wikipedia
This paper examines the development of Moroccan Darija Wikipedia since its launch in July 2020. It details the strategies employed by the Wikimedia Morocco user group, focusing on bot automation and editing contests, to foster growth within this low-resource language Wikipedia. The paper highlights the opportunities Darija Wikipedia presents for Artificial Intelligence research, particularly in Natural Language Processing, given its status as the largest online Darija dataset. It also explores how the standardization efforts undertaken by the user group enable valuable collaboration between volunteers, experts, and researchers, potentially setting a precedent for other similar language communities. Furthermore, the paper addresses key challenges, including ensuring community sustainability and mitigating vandalism, and analyzes the manifestation of diverse spelling conventions (phonetic, etymological) within the encyclopedia’s content.
Adam Mickiewicz University Poznan
Related Results
MDVC corpus: empowering Moroccan Darija speech recognition
Automatic speech recognition (ASR) technology has significantly transformed human-machine interactions, but it remains limited in its representation of diverse languages and dialec...
A Hybrid Lexicon–Transformer Framework for Sentiment, Emotion, and Context Classification in Moroccan Darija (TriLex-Darija)
This paper introduces TriLex-Darija, a large-scale affective lexicon suite and a hybrid lexicon–transformer framework for analyzing Morocca...
Exploiting Wikipedia Semantics for Computing Word Associations
Semantic association computation is the process of automatically quantifying the strength of a semantic connection between two textual units based on various lexi...
Wikipedia: a tool to monitor seasonal diseases trends?
Objective: To explore the interest of Wikipedia as a data source to monitor seasonal diseases trends in metropolitan France. Introduction: Today, Internet, especially Wikipedia, is an im...
The Implications of Spanish-Moroccan Governmental Relations for Moroccan Immigrants in Spain Spanish-Moroccan Governmental Relations and Moroccan Immigrants
Abstract: The terrorist attacks in Madrid on March 11, 2004 were one of the most traumatic events in recent Spanish domestic history, and have had a profound influence in internal po...
Wikipedia in Vascular Surgery Medical Education: Comparative Study (Preprint)
Background: Medical students commonly refer to Wikipedia as their preferred online resource for medical information. The quality and readability of articles ...
Arabic Darija dialect on the YouTube account of Aisha Devia official: A sociolinguistic approach
This study aims to explain the factors behind the emergence of the Darija dialect in Morocco and to describe the types of Moroccan dialects, especially on Aisha Devi's Official You...
RISE OF DARIJA IN MOROCCAN DIGITAL ADVERTISING
This article examines the sociolinguistic transformation of advertising in Morocco. It mainly focuses on the integration of Moroccan Arabic (Darija) in influencer marketing. Histor...

