Fusion of Fast-text and Indo-Wordnet for Disambiguation of Word Sense in the Marathi Language

This research combines the fastText model with Indo-WordNet to address word sense disambiguation (WSD) in Marathi text. The initial iteration of the algorithm used exact word-pair matching to measure the overlap between the items in the "context bag" and the "sense bag" derived from the lexical resource WordNet. The current method instead computes the overlap with a semantic similarity metric based on fastText subword embeddings. This approach handles previously unseen word forms while still capturing the underlying semantics of the terms. WSD has made significant progress for English and many European languages, but it remains a substantial challenge for Marathi and other languages spoken in India. The Marathi text corpus, sourced from the Government of India, comprises a large collection of Marathi sentences. The dataset used in this study consisted of the Indo-WordNet for Marathi and the Marathi Online Dictionary. The experimental results are promising. Target words whose synsets in WordNet are semantically distinct achieve high F1 scores. The achieved F1 score of 89% exceeds the baseline and represents a substantial improvement over previous knowledge-based methods for low-resource Indian languages.
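As a rough illustration of the two overlap strategies contrasted in the abstract, the sketch below scores a "context bag" against candidate "sense bags" first by exact word matching and then by embedding similarity. Everything specific in it is an assumption made for illustration: the hashed character n-gram vectors are a toy stand-in for trained fastText subword embeddings, the sense bags are invented English examples rather than Indo-WordNet entries, and the scoring rule (mean of the best cosine match per context word) is one plausible choice, since the abstract does not state the exact metric used.

```python
import hashlib
import math

DIM = 256  # toy embedding size (assumption, not from the paper)


def char_ngrams(word, n_min=3, n_max=5):
    """Character n-grams of the word wrapped in '<' and '>' boundary markers."""
    marked = f"<{word}>"
    return [marked[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(marked) - n + 1)]


def ngram_vector(ngram):
    """Deterministic pseudo-random vector; stand-in for a trained n-gram vector."""
    raw = hashlib.shake_256(ngram.encode("utf-8")).digest(DIM)
    return [(b - 127.5) / 127.5 for b in raw]


def word_vector(word):
    """Average the n-gram vectors, so an unseen word form still gets a vector."""
    grams = char_ngrams(word) or [word]
    total = [0.0] * DIM
    for gram in grams:
        for i, x in enumerate(ngram_vector(gram)):
            total[i] += x
    return [x / len(grams) for x in total]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def exact_overlap(context_bag, sense_bag):
    """Earlier strategy: count words that appear verbatim in both bags."""
    return len(set(context_bag) & set(sense_bag))


def soft_overlap(context_bag, sense_bag, vec=word_vector):
    """Assumed metric: mean over context words of the best cosine match."""
    if not context_bag or not sense_bag:
        return 0.0
    sense_vecs = [vec(w) for w in sense_bag]
    best = [max(cosine(vec(w), sv) for sv in sense_vecs) for w in context_bag]
    return sum(best) / len(best)


def disambiguate(context_bag, sense_bags, score=soft_overlap):
    """Return the sense label whose bag scores highest against the context."""
    return max(sense_bags, key=lambda sense: score(context_bag, sense_bags[sense]))


# Hypothetical toy data: inflected context words, base-form sense bags.
context_bag = ["deposited", "loans"]
sense_bags = {
    "finance": ["deposit", "loan", "money"],
    "river": ["water", "shore", "stream"],
}

# Exact matching finds nothing, because no surface form is shared.
print(exact_overlap(context_bag, sense_bags["finance"]))  # 0
print(exact_overlap(context_bag, sense_bags["river"]))    # 0

# Subword similarity can still separate the senses via shared n-grams.
for sense, bag in sense_bags.items():
    print(sense, round(soft_overlap(context_bag, bag), 3))
print(disambiguate(context_bag, sense_bags))
```

Because these toy vectors are untrained, they only reward shared character sequences; with a trained model, the `vec` argument could instead wrap a real lookup (for example `get_word_vector` in the fastText Python bindings), which would also capture similarity between words that share no surface material.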

Technoscience Academy

Mr. Aparitosh Gahankari, Dr. Avinash S. Kapse, Dr. Mohammad Atique, Dr. V.M. Thakare, Dr. Arvind S. Kapse

International Journal of Scientific Research in Science and Technology

2025
