
Fusion of Fast-text and Indo-Wordnet for Disambiguation of Word Sense in the Marathi Language

This research combines the fastText model with Indo-WordNet to address word sense disambiguation (WSD) in Marathi. The initial version of the algorithm used word-pair matching to determine the overlap between the items in the "context bag" and the "sense bag" derived from the lexical resource WordNet. The current method instead computes the overlap with a semantic similarity metric based on fastText subword embeddings. Because the embeddings are built from subwords, this approach handles previously unseen word forms while still capturing the meaning of the terms. WSD has advanced significantly for English and many European languages, but it remains a substantial challenge for Marathi and other languages spoken in India. The Marathi text corpus, sourced from the Government of India, comprises a large collection of Marathi sentences. The dataset used in this study consisted of the Indo-WordNet for Marathi and the Marathi Online Dictionary. The experimental results are promising. Target words whose WordNet synsets are semantically distinct achieve a high F1 score. The achieved F1 score of 89% is above the baseline and represents a substantial advance over previous knowledge-based methods for low-resource Indian languages.
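The move from exact word-pair matching to similarity-based overlap can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: a character trigram count vector (`ngram_vector`) stands in for real fastText subword embeddings, the aggregation rule (average of each context word's best match in the sense bag) is one plausible choice among several, and the English toy bags replace Marathi data and Indo-WordNet glosses.

```python
import math
from collections import Counter
from typing import Dict, List


def ngram_vector(word: str, n: int = 3) -> Counter:
    """Toy stand-in for a subword embedding: counts of character n-grams.

    The word is padded with '<' and '>' so prefixes and suffixes get
    their own n-grams, loosely mirroring how subword models see words.
    """
    padded = f"<{word}>"
    return Counter(padded[i:i + n] for i in range(max(1, len(padded) - n + 1)))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def exact_overlap(context_bag: List[str], sense_bag: List[str]) -> int:
    """Baseline: number of words the two bags share exactly."""
    return len(set(context_bag) & set(sense_bag))


def soft_overlap(context_bag: List[str], sense_bag: List[str]) -> float:
    """Assumed aggregation: mean of each context word's best sense-bag match."""
    if not context_bag or not sense_bag:
        return 0.0
    sense_vecs = [ngram_vector(w) for w in sense_bag]
    best = [max(cosine(ngram_vector(w), s) for s in sense_vecs) for w in context_bag]
    return sum(best) / len(best)


def disambiguate(context_bag: List[str], sense_bags: Dict[str, List[str]]) -> str:
    """Pick the sense whose bag has the highest soft overlap with the context."""
    return max(sense_bags, key=lambda sense: soft_overlap(context_bag, sense_bags[sense]))


# Hypothetical toy data: no word appears verbatim in both bags.
CONTEXT = ["river", "water", "flows"]
SENSE_BAGS = {
    "bank_river": ["riverside", "waters", "shore"],
    "bank_finance": ["money", "deposit", "loan"],
}

if __name__ == "__main__":
    for sense, bag in SENSE_BAGS.items():
        print(sense, exact_overlap(CONTEXT, bag), round(soft_overlap(CONTEXT, bag), 3))
    print("chosen sense:", disambiguate(CONTEXT, SENSE_BAGS))
```

In this toy case exact matching finds no overlap with either sense, while the subword-based score still separates them because "river" and "riverside", or "water" and "waters", share most of their character trigrams. A real system would replace `ngram_vector` with vectors from a trained fastText model.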

Related Results

Hubungan Perilaku Pola Makan dengan Kejadian Anak Obesitas
The Nuclear Fusion Award
The Nuclear Fusion Award ceremony for 2009 and 2010 award winners was held during the 23rd IAEA Fusion Energy Conference in Daejeon. This time, both 2009 and 2010 award winners w...
E-Press and Oppress
From elephants to ABBA fans, silicon to hormone, the following discussion uses a new research method to look at printed text, motion pictures and a te...
Semi-Supervised Word Sense Disambiguation via Context Weighting
Word sense disambiguation as a central research topic in natural language processing can promote the development of many applications such as information retrieval, speech synthesi...
INDONESIA'S DIGITAL DIPLOMACY STRATEGY TOWARD THE INDO-PACIFIC REGION
Abstract The Indo-Pacific is a geographical concept that covers the Indian Ocean and Pacific Ocean regions. The vast area of the Indo-Pacific region and the many actors involved have...
Nonproliferation and fusion power plants
Abstract The world now appears to be on the brink of realizing commercial fusion. As fusion energy progresses towards near-term commercial deployment, the question arises a...
Rodnoosjetljiv jezik na primjeru njemačkih časopisa Brigitte i Der Spiegel
On the basis of the comparative analysis of texts of the German biweekly magazine Brigitte and the weekly magazine Der Spiegel and under the presumption that gender-sensitive langu...
Leveraging Large Language Models to Build a Cutting-Edge French Word Sense Disambiguation Corpus
Abstract With the increasing amount of data circulating over the Web, there is a growing need to develop and deploy tools aimed at unraveling semantic nuances within text o...
