
Evaluation of Indonesian Language Stemmer Algorithms: A Comparative Analysis

Indonesian has a large number of speakers and a diverse vocabulary. A main challenge in processing it is its agglutinative morphology, in which words are built by attaching affixes to a root. This complexity makes it difficult for traditional stemming algorithms developed for European languages to handle Indonesian words accurately. This review focuses on several prominent stemming algorithms developed specifically for Indonesian: those of Nazief and Adriani, Asian, and Arifin and Setiono, together with the Enhanced Confix Stripping (ECS) stemmer. Examining these algorithms clarifies their methodologies, efficacy, and applications. The study found that the ECS stemmer outperformed the other algorithms in both accuracy and efficiency: it stripped affixes more effectively and identified root forms more accurately, which improves text analysis and information retrieval. As language technology continues to evolve, ongoing research into these methods will be crucial for processing Indonesian text accurately and effectively.
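To make the idea of affix stripping concrete, the following is a minimal sketch of a dictionary-checked stripper. It is not the ECS or Nazief and Adriani rule set: the root lexicon `ROOTS`, the affix lists, and the function name `stem` are all hypothetical and chosen only for illustration.

```python
# Toy dictionary-checked affix stripper for Indonesian.
# ROOTS, PREFIXES and SUFFIXES are illustrative assumptions, not the
# rule tables of any published stemmer.

ROOTS = {"makan", "baca", "main", "tulis"}
PREFIXES = ("mem", "me", "ber", "di", "ter")
SUFFIXES = ("kan", "an", "i")


def stem(word, roots=ROOTS):
    """Return the root of `word` if one is found in `roots`, else `word`."""
    word = word.lower()
    if word in roots:
        return word

    # Candidate bases: the word itself, plus the word with one suffix removed.
    bases = [word] + [word[: -len(s)] for s in SUFFIXES if word.endswith(s)]

    for base in bases:
        if base in roots:
            return base
        # Try removing one prefix and accept only if the remainder is a known root.
        for prefix in PREFIXES:
            if base.startswith(prefix) and base[len(prefix):] in roots:
                return base[len(prefix):]

    # No root found: leave the word unchanged rather than guess.
    return word


if __name__ == "__main__":
    for w in ["makanan", "membaca", "dimainkan", "menulis"]:
        print(w, "->", stem(w))
```

The dictionary check is what keeps the sketch from over-stripping: an affix is removed only when the remainder is a known root. The sketch deliberately ignores the spelling changes that prefixes cause, so it leaves "menulis" unchanged, because the root "tulis" loses its initial "t" after the prefix. Handling such cases requires restoring the dropped consonant, which is beyond this toy example.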

Information Technology and Science (ITScience)

Fitrah Rumaisa

Brilliance: Research of Artificial Intelligence

2025



