Saraiki Language Hybrid Stemmer Using Rule-Based and LSTM-Based Sequence-To-Sequence Model Approach
* Corresponding Author: mubasher@isp.edu.pk
Stemming, the conversion of a word to its root form, is a core task in natural language processing (NLP) and an integral part of linguistic pre-processing in NLP applications. It reduces inflected word forms to their root word. Much stemming work has been done for national and regional languages such as English, French, Arabic, German, Urdu, and Hindi, but many regional languages still lack digital NLP resources. Saraiki is one of the most widely spoken regional languages of Pakistan, used by almost eighty million people, yet very few digital resources are available to support NLP for it. This research proposes a hybrid stemmer for Saraiki words. The hybrid stemmer combines two hundred prefix and postfix rules with a long short-term memory (LSTM) based sequence-to-sequence model that converts Saraiki words into their stems. First, the Saraiki text was pre-processed and the rule set was implemented. Second, the LSTM-based sequence-to-sequence model was deployed to stem Saraiki words. Finally, the stemmer was evaluated by measuring how accurately the rule set and the LSTM-based sequence-to-sequence model each produced the correct stem. In the experiments, the rule set achieved a stemming accuracy of 68.53%, while the LSTM-based sequence-to-sequence model achieved 93.0%. This work contributes to the regional linguistic field by introducing a stemmer for the Saraiki language.
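To make the pipeline above concrete, the following is a minimal sketch of affix-stripping rules, a hybrid fallback, and the accuracy measure. It is not the authors' implementation. The affix lists are toy English placeholders rather than the paper's two hundred Saraiki rules, the names `rule_based_stem`, `hybrid_stem`, `accuracy`, and `MIN_STEM_LEN` are hypothetical, and the idea that the sequence-to-sequence model acts as a fallback when no rule fires is an assumption, since the abstract does not say how the two components are combined.

```python
from typing import Callable, Dict, Optional

# Hypothetical toy rules for illustration only (NOT the paper's Saraiki rules).
PREFIX_RULES = ["un", "re"]
SUFFIX_RULES = ["ing", "ed", "s"]
MIN_STEM_LEN = 3  # assumed guard against over-stripping short words


def rule_based_stem(word: str) -> Optional[str]:
    """Strip at most one prefix and one suffix; return None if no rule fires."""
    stem = word
    # Try longer affixes first so the most specific rule wins.
    for prefix in sorted(PREFIX_RULES, key=len, reverse=True):
        if stem.startswith(prefix) and len(stem) - len(prefix) >= MIN_STEM_LEN:
            stem = stem[len(prefix):]
            break
    for suffix in sorted(SUFFIX_RULES, key=len, reverse=True):
        if stem.endswith(suffix) and len(stem) - len(suffix) >= MIN_STEM_LEN:
            stem = stem[:-len(suffix)]
            break
    return stem if stem != word else None


def hybrid_stem(word: str, fallback: Callable[[str], str]) -> str:
    """Use the rule set first; defer to `fallback` (e.g. a trained
    sequence-to-sequence model) when no rule applies. Assumed combination."""
    stem = rule_based_stem(word)
    return stem if stem is not None else fallback(word)


def accuracy(stemmer: Callable[[str], str], gold: Dict[str, str]) -> float:
    """Percentage of words whose predicted stem equals the gold stem."""
    correct = sum(1 for word, stem in gold.items() if stemmer(word) == stem)
    return 100.0 * correct / len(gold)
```

Here `fallback` stands in for the trained LSTM model, and `accuracy` corresponds to the share of correctly stemmed words reported in the abstract. In practice the gold dictionary would be a held-out list of Saraiki words paired with their stems.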
University of Management and Technology

