Evaluation of Indonesian Language Stemmer Algorithms: A Comparative Analysis
Indonesian has a large number of speakers and a diverse vocabulary. A central challenge in processing it is its agglutinative morphology, in which words are formed by attaching prefixes, suffixes, and confixes to a root. This complexity makes it difficult for traditional stemming algorithms developed for European languages to handle Indonesian words accurately. This review focuses on several prominent stemming algorithms developed specifically for Indonesian: those of Nazief and Adriani, Asian, and Arifin and Setiono, and the Enhanced Confix Stripping (ECS) stemmer. Examining these algorithms clarifies their methodologies, efficacy, and applications. The study found that the ECS stemmer outperformed the other algorithms in both accuracy and efficiency. It stripped affixes more effectively and identified the root form of words more accurately, leading to improved text analysis and information retrieval. As language technology continues to evolve, further research into these methods will be crucial for processing Indonesian text accurately and effectively.
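To make the idea of affix stripping concrete, the following is a minimal Python sketch of a dictionary-backed stemmer. It is a deliberately simplified illustration and not the Nazief and Adriani or ECS algorithm as published: the tiny `ROOTS` lexicon, the affix lists, the function names, and the minimum stem length of three characters are all assumptions made for this example.

```python
# Toy root-word lexicon (assumption: real stemmers use a full dictionary).
ROOTS = {"makan", "main", "baca", "ajar", "buku"}

# Simplified affix inventories (assumption: not a complete list).
PARTICLES = ("lah", "kah", "pun")
POSSESSIVES = ("ku", "mu", "nya")
DERIVATIONAL_SUFFIXES = ("kan", "an", "i")
PREFIXES = ("meng", "mem", "men", "me", "ber", "bel", "be",
            "ter", "di", "ke", "se", "peng", "pem", "pen", "pe")

MIN_STEM_LEN = 3  # hypothetical guard against stripping too much


def strip_one_suffix(word, suffixes):
    """Remove the first matching suffix, or return the word unchanged."""
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            return word[:-len(suffix)]
    return word


def stem(word):
    """Return a dictionary root for `word`, or the lowercased word if none is found."""
    word = word.lower()
    if word in ROOTS:
        return word

    # Strip suffix classes from the outside in, checking the lexicon each time.
    candidates = [word]
    for suffixes in (PARTICLES, POSSESSIVES, DERIVATIONAL_SUFFIXES):
        stripped = strip_one_suffix(candidates[-1], suffixes)
        if stripped != candidates[-1]:
            if stripped in ROOTS:
                return stripped
            candidates.append(stripped)

    # Try removing up to two prefixes, starting from the most-stripped candidate.
    for candidate in reversed(candidates):
        current = candidate
        for _ in range(2):
            for prefix in PREFIXES:
                if (current.startswith(prefix)
                        and len(current) - len(prefix) >= MIN_STEM_LEN):
                    current = current[len(prefix):]
                    break
            else:
                break  # no prefix matched; give up on this candidate
            if current in ROOTS:
                return current

    return word  # no root found; leave the word as it is


if __name__ == "__main__":
    for w in ["makanan", "membaca", "bermain", "dibacanya", "belajar"]:
        print(w, "->", stem(w))
```

Checking the lexicon after every stripping step is what keeps a sketch like this from reducing an unknown word to a meaningless fragment. The published algorithms add further rules that this sketch omits, such as restoring root-initial letters altered by a prefix and handling disallowed prefix-suffix combinations.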
Information Technology and Science (ITScience)

