Improving the Accuracy of Text Classification using Stemming Method, A Case of Non-formal Indonesian Conversation
Abstract
Stemming has long been used as a data pre-processing step in information retrieval to reduce affixed words to their root words. However, few stemming methods exist for non-formal Indonesian text. The existing stemming method is highly accurate for formal Indonesian but performs poorly on non-formal Indonesian, so a stemming method that yields an accurate classifier model for non-formal Indonesian remains an open challenge. This study introduces a new stemming method for pre-processing non-formal Indonesian text and examines how strengthening the stemming step improves the accuracy of text classifier models. A text classifier model is built with the Support Vector Machine algorithm and its accuracy is measured. The experimental evaluation tested 550 datasets in Indonesian using two different stemming methods. The results show that the text classifier model reaches an accuracy of 0.85 with the proposed stemming method, compared with 0.73 with the existing method. In the future, the proposed stemming method can be used to develop Indonesian text classifier models for purposes such as text clustering, summarization, hate speech detection, and other text processing applications.
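As a rough illustration of what stemming does to non-formal Indonesian tokens, the sketch below strips a few formal and colloquial affixes and checks the result against a tiny root lexicon. The affix lists, the lexicon, and the `stem` function are assumptions made for illustration only; this is not the stemming method proposed in the study, and it ignores rules such as nasal assimilation.

```python
# Toy affix-stripping stemmer for illustration only.
# ROOTS, PREFIXES and SUFFIXES are hypothetical and far from complete;
# they are NOT taken from the paper.

ROOTS = {"makan", "ambil", "main", "baca"}  # hypothetical root lexicon

# Formal prefixes plus the colloquial "ng"/"nge" forms.
PREFIXES = ("nge", "ng", "me", "di", "ber", "ter")
# Formal suffixes plus the colloquial "-in".
SUFFIXES = ("kan", "nya", "in", "an", "i")

MIN_STEM_LEN = 3  # avoid stripping a word down to one or two letters


def stem(word: str) -> str:
    """Return a known root reachable by stripping affixes, else the word itself."""
    word = word.lower()
    if word in ROOTS:
        return word

    # Step 1: the word itself plus every variant with one suffix removed.
    candidates = {word}
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            candidates.add(word[: -len(suffix)])

    # Step 2: for each candidate, also try removing one prefix.
    for cand in list(candidates):
        for prefix in PREFIXES:
            if cand.startswith(prefix) and len(cand) - len(prefix) >= MIN_STEM_LEN:
                candidates.add(cand[len(prefix):])

    # Step 3: accept only candidates confirmed by the lexicon.
    found = [c for c in candidates if c in ROOTS]
    return max(found, key=len) if found else word


if __name__ == "__main__":
    for token in ["dimakan", "ngambilin", "mainin", "bacanya", "gue"]:
        print(token, "->", stem(token))
```

Checking candidates against a lexicon keeps the sketch from over-stemming words that merely end in an affix-like string; words with no known root, such as the slang pronoun "gue", pass through unchanged.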
Springer Science and Business Media LLC
Related Results
Funkcije komunikacijski relevantne šutnje u njemačkome (Functions of communicatively relevant silence in German)
Additionally, this chapter presents research of silence with review of main aspects of papers in the field of conversational analysis, ethnography of communication and metaphor of ...
Improving the accuracy of text classification using stemming method, a case of non-formal Indonesian conversation
Abstract
Background
Stemming has long been used in data pre-processing to retrieve information by tracking affixed words ...
Improving the Accuracy of Text Classification using Stemming Method, A Case of Non-Formal Indonesian Conversation
Abstract
Background: Stemming has long been used in data pre-processing to retrieve information by tracking affixed words back into their root. In an Indonesian setting, ex...
Sleep Habits and Occurrence of Lowback Pain among Craftsmen
Hydatid Disease of The Brain Parenchyma: A Systematic Review
Abstract
Introduction
Isolated brain hydatid disease (BHD) is an extremely rare form of echinococcosis. A prompt and timely diagnosis is a crucial step in disease management. This ...
Enhancing Non-Formal Learning Certificate Classification with Text Augmentation: A Comparison of Character, Token, and Semantic Approaches
Aim/Purpose: The purpose of this paper is to address the gap in the recognition of prior learning (RPL) by automating the classification of non-formal learning certificates using d...
Computer-Mediated Chat
The technical apparatus is, then, being made at home with the rest of our world. And that's a thing that's routinely being done, and it's the source of the failure of technocratic ...

