Amharic Adhoc Information Retrieval System Based on Morphological Features
Information retrieval (IR) is one of the most important research and development areas due to the explosion of digital data and the need to access relevant information from huge corpora. Although IR systems function well for technologically advanced languages such as English, this is not the case for morphologically complex, under-resourced and less-studied languages such as Amharic. Amharic is a Semitic language characterized by a complex morphology in which thousands of words are generated from a single root form through inflection and derivation. This has made the development of Amharic natural language processing (NLP) tools a challenging task. Amharic ad hoc retrieval also faces challenges due to the scarcity of linguistic resources, tools and standard evaluation corpora. In this research work, we investigate the impact of morphological features on the representation of Amharic documents and queries for ad hoc retrieval. We also analyze the effects of stem-based and root-based text representation, and propose a new Amharic IR system architecture. Moreover, we present the resources and corpora we constructed for the evaluation of Amharic IR systems and other NLP tools. We conduct various experiments on an Amharic IR test collection following a TREC-like approach, using a standard evaluation framework and measures. Our findings show that root-based text representation outperforms the conventional stem-based representation on Amharic IR.
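To make the contrast between stem-based and root-based representation concrete, the following is a minimal sketch, not the authors' system. It assumes two hand-made lookup tables (`TOY_STEMS`, `TOY_ROOTS`) over simplified transliterated word forms that share the consonantal root s-b-r; the stems, the document ids, the relevance judgments and the choice of average precision as the measure are all illustrative assumptions. The toy data is constructed so that the effect is visible, and it says nothing about the size of the effect reported in the paper.

```python
# Toy comparison of stem-based vs. root-based term normalization.
# All tables and documents below are hypothetical illustration data.

# Assumed stems: each surface form keeps a different stem.
TOY_STEMS = {"sebere": "seber", "sebari": "sebar", "tesebere": "teseber"}
# Assumed roots: all three surface forms map to the same root.
TOY_ROOTS = {"sebere": "sbr", "sebari": "sbr", "tesebere": "sbr"}

# Hypothetical one-token documents; d4 is an unrelated placeholder.
DOCS = {
    "d1": ["sebere"],
    "d2": ["sebari"],
    "d3": ["tesebere"],
    "d4": ["unrelated"],
}
RELEVANT = {"d1", "d2", "d3"}  # assumed relevance judgments


def retrieve(query_terms, docs, normalize):
    """Rank documents by the number of normalized tokens shared with the query."""
    query = {normalize(t) for t in query_terms}
    scored = []
    for doc_id, tokens in docs.items():
        score = sum(1 for t in tokens if normalize(t) in query)
        if score > 0:
            scored.append((score, doc_id))
    # Highest score first; ties broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked list against a set of relevant ids."""
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            total += hits / rank
    return total / len(relevant_ids) if relevant_ids else 0.0


def by_stem(token):
    return TOY_STEMS.get(token, token)


def by_root(token):
    return TOY_ROOTS.get(token, token)


query = ["sebere"]
stem_ranking = retrieve(query, DOCS, by_stem)
root_ranking = retrieve(query, DOCS, by_root)
stem_ap = average_precision(stem_ranking, RELEVANT)
root_ap = average_precision(root_ranking, RELEVANT)
print("stem-based:", stem_ranking, round(stem_ap, 3))
print("root-based:", root_ranking, round(root_ap, 3))
```

In this toy setting the stem-based index matches only the document containing the exact stem of the query word, while the root-based index also retrieves the derived forms. A real system would replace the lookup tables with a morphological analyzer and the overlap count with a proper ranking function.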
Related Results
Developing an audio search engine for Amharic speech web resources
Abstract
While general-purpose search engines primarily serve English-language content, the web has seen enormous growth in non-resource-rich languages like Amhar...
Developing Amharic Sign Language Recognition Model for Amharic Characters Using Deep Learning Approach
Abstract
Hearing-impaired people use Sign Language to communicate with each other as well as with other communities. Usually, they are unable to communicate with normal peo...
Translation, reliability, and validity of Amharic versions of the Pelvic Floor Distress Inventory (PFDI-20) and Pelvic Floor Impact Questionnaire (PFIQ-7)
Purpose
Pelvic Floor Disorders (PFDs) affect many women and have a significant impact on their quality of life. Pelvic Floor Impact Questionnaire (PFIQ-7) and ...
Coreference Resolution for Amharic Text using Bidirectional Encoder Representation from Transformer (BERT)
Abstract
Coreference resolution is the process of finding expressions that refer to the same entity in a text. In coreference resolution similar entities are men...
PRACTICALITY OF ALTERNATIVE ASSESSMENTS: FROM AMHARIC LANGUAGE INSTRUCTORS’ VIEW POINTS
The purpose of this study was to examine the practicality of alternative assessment in the Amharic language context of Ethiopian higher education. The study also endeavors to ...
syntax of Amharic ideophones
This study is on Amharic ideophones, a subject that has not been described well in the syntax of Amharic. The data used for the analysis are collected from natural settings of the ...
Evaluating the validity of the Amharic Brief Pain Inventory among people with chronic primary musculoskeletal pain in Ethiopia
Abstract
Background
The Brief Pain Inventory (BPI) is a multidimensional pain assessment tool used to evaluate pain severity and pain interference. ...

