
A Rule Based Stemmer

The digital world generates enormous amounts of data continuously, and mining knowledge from that data effectively has become a pressing need. Sentiment analysis, a part of web content mining (itself a branch of web mining), has gained momentum as a research area. It analyses the opinions of a wide variety of people around the world. Sentiment analysis encompasses preprocessing, feature selection, classification, and sentiment prediction. Preprocessing is an important stage that draws on many techniques; stop word removal, punctuation removal, and conversion of numbers to number names are among the most basic. Stemming is another important preprocessing technique, which reduces the different forms of a word to its root. There are three basic types of stemmers: truncating, statistical, and hybrid. This paper proposes a rule-based stemmer of the truncating type, built on rules for truncation and replacement. The input passes through a series of rules: if a rule's condition is satisfied, that rule is executed; otherwise the input is checked against the next rule, and the process continues. The result is the set of stemmed words. The performance of the proposed stemmer is compared with two existing stemmers in the same rule-based category, Porter and Lancaster, using several evaluation metrics. The observations show that the proposed stemmer outperforms the Porter and Lancaster stemmers on the correctly stemmed words factor, achieves a good average conflation factor, and produces fewer over-stemming and under-stemming errors.
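The rule cascade described in the abstract can be illustrated with a minimal Python sketch. The abstract does not list the paper's rules, so everything here is an assumption made for illustration: the `Rule` structure, the five example rules, the minimum-stem-length condition, and the choice to stop at the first rule that fires. It shows the general technique of ordered truncation and replacement rules, not the authors' stemmer.

```python
from typing import List, NamedTuple


class Rule(NamedTuple):
    suffix: str       # condition: the word must end with this suffix
    replacement: str  # "" means pure truncation, otherwise replacement
    min_stem: int     # condition: characters that must remain before the suffix


# Hypothetical rules for illustration only; NOT the rule set from the paper.
RULES: List[Rule] = [
    Rule("ies", "y", 2),    # replacement: "studies" -> "study"
    Rule("sses", "ss", 2),  # replacement: "classes" -> "class"
    Rule("ing", "", 3),     # truncation:  "walking" -> "walk"
    Rule("ed", "", 3),      # truncation:  "jumped"  -> "jump"
    Rule("s", "", 3),       # truncation:  "cats"    -> "cat"
]


def stem(word: str, rules: List[Rule] = RULES) -> str:
    """Apply the first rule whose condition holds; otherwise try the next rule."""
    word = word.lower()
    for rule in rules:
        stem_len = len(word) - len(rule.suffix)
        if word.endswith(rule.suffix) and stem_len >= rule.min_stem:
            # Condition satisfied: execute the rule (truncate, then append).
            return word[:stem_len] + rule.replacement
    # No rule matched: the word is returned unchanged.
    return word


def stem_all(words: List[str]) -> List[str]:
    """Stem every word in a list."""
    return [stem(w) for w in words]


if __name__ == "__main__":
    print(stem_all(["studies", "classes", "walking", "jumped", "cats", "sing"]))
```

Rule order matters in a cascade like this: the more specific suffix "sses" is listed before the general "s" so that it is tried first. The minimum-stem-length condition is one simple guard against over-stemming short words such as "sing" or "bus", which this sketch leaves unchanged.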
Blue Eyes Intelligence Engineering and Sciences Publication - BEIESP

