
A Rule Based Stemmer

The digital world generates enormous amounts of data every instant, and mining knowledge from that data effectively has become a pressing need. Sentiment analysis, a part of web content mining (itself a subfield of web mining), has gained momentum as a research area. It analyses the opinions of a wide variety of people across the world. Sentiment analysis encompasses preprocessing, feature selection, classification, and sentiment prediction. Preprocessing is an important stage that involves many techniques; stop-word removal, punctuation removal, and conversion of numbers to number names are some of the basic ones. Stemming is another important preprocessing technique: it reduces the different forms of a word to a common root. There are three basic types of stemmers: truncating, statistical, and hybrid. This paper proposes a rule-based stemmer of the truncating type. It uses rules for truncation and replacement. The input passes through a series of rules: if a rule's condition is satisfied, that rule is executed; otherwise the input is checked against the next rule, and so on. The result is a set of stemmed words. The proposed stemmer is compared with two existing stemmers in the same rule-based category, Porter and Lancaster, using several evaluation metrics. The observations indicate that the proposed stemmer outperforms the Porter and Lancaster stemmers on the correctly-stemmed-words factor, shows a good average conflation factor, and produces fewer over-stemming and under-stemming errors.
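
The rule cascade described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's rule set, which the abstract does not list: the `Rule` type, the `RULES` table, the suffixes, and the minimum-stem-length condition are all hypothetical, chosen only to show how a word is checked against one rule after another until a condition holds.

```python
from typing import List, NamedTuple


class Rule(NamedTuple):
    """One hypothetical truncation/replacement rule (not from the paper)."""
    suffix: str        # ending that triggers the rule
    replacement: str   # text appended after the suffix is cut ("" = pure truncation)
    min_stem_len: int  # condition: shortest stem the rule may leave behind


# Illustrative rules only; more specific suffixes are listed first.
RULES: List[Rule] = [
    Rule("sses", "ss", 2),  # replacement: classes -> class
    Rule("ies", "y", 2),    # replacement: ponies  -> pony
    Rule("ing", "", 3),     # truncation:  walking -> walk
    Rule("ed", "", 3),      # truncation:  jumped  -> jump
    Rule("ss", "ss", 1),    # guard: leave words such as "glass" unchanged
    Rule("s", "", 3),       # truncation:  cats    -> cat
]


def stem(word: str, rules: List[Rule] = RULES) -> str:
    """Apply the first rule whose condition is satisfied; otherwise try the next."""
    word = word.lower()
    for rule in rules:
        if word.endswith(rule.suffix):
            stem_len = len(word) - len(rule.suffix)
            if stem_len >= rule.min_stem_len:
                # Condition satisfied: execute the rule and stop.
                return word[:stem_len] + rule.replacement
        # Condition not satisfied: fall through to the next rule.
    return word  # no rule applied


if __name__ == "__main__":
    for w in ["classes", "ponies", "walking", "sing", "glass", "cats"]:
        print(w, "->", stem(w))
```

Ordering the rules from most to least specific is one common design choice for such cascades; whether the proposed stemmer orders its rules this way is not stated in the abstract.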
Blue Eyes Intelligence Engineering and Sciences Publication - BEIESP

Related Results

Saraiki Language Hybrid Stemmer Using Rule-Based and LSTM-Based Sequence-To-Sequence Model Approach
Converting a word to its original form is called stemming, which is extremely important in the field of natural language processing (NLP). It’s an integral part of the linguistic ...
An International Rule of Law
The “international rule of law” is an elusive concept. Under this heading, mainly two variations are being discussed: The international rule of law “proper” and an “internationaliz...
Memorization Techniques for Peter Chew Rule
We normally use the sine rule to find the side opposite a given angle when we are given two angles and one side. We can also use the sine rule to find a non-included angle when we are given t...
Application Peter Chew Rule in Calculator Design
Mathematics has always been a challenging topic for high school and university students around the world. The literature reveals that the use of technological tools has had a major i...
The FDA’s Proposed Rule on Laboratory-Developed Tests: Impacts on Clinical Laboratory Testing and Patient Care
In October 2023, the U.S. Food and Drug Administration (FDA) released a proposed rule to regulate laboratory-developed tests (LDTs) as medical devices. Whi...
The SEC's Shareholder Proposal Rule
In this Article, we take advantage of this Symposium’s goals to think broadly about the future of Rule 14a-8 of the Securities Exchange Act of 1934, the shareholder proposal rule. ...
Evaluation of Indonesian Language Stemmer Algorithms: A Comparative Analysis
Indonesian is a language with a large number of speakers and diverse vocabulary. One of the main challenges of Indonesian language processing is the presence of agglutinative morph...
Children extract a new linguistic rule more quickly than adults
Children achieve better long‐term language outcomes than adults. However, it remains unclear whether children actually learn language more quickly than adults during real‐t...
