Annotated Lexicon for Sentiment Analysis in the Bosnian Language
The paper presents the first sentiment-annotated lexicon of the Bosnian language. The annotation process and methodology are presented along with a usability study that concentrates on language coverage. The starting base was composed by translating the Slovenian annotated lexicon and then manually checking the translations and annotations. Language coverage was measured on two reference corpora. Bosnian is still considered a low-resource language. A reference corpus composed of automatically crawled web pages is available for Bosnian, but the authors had difficulty finding any corpus with a clear time frame for the texts it contains. A corpus of contemporary texts was therefore constructed by collecting news articles from several Bosnian web portals. Two language coverage methods were used in the experiment. The first used a frequency list of all words extracted from the two reference corpora, and the second did not use frequency as the main factor in counting. With the first method, the computed coverage was 19.24% for the first corpus and 28.05% for the second. With the second method, coverage was 2.34% for the first corpus and 6.98% for the second. The resulting language coverage is comparable to the state of the art in the field. The usability of the lexicon has already been demonstrated in a Twitter-based comparison.
University of Ljubljana
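The two coverage measures described in the abstract could be computed along the following lines. This is a minimal sketch, not the authors' implementation: it assumes that the first method is token-level coverage (each word weighted by its corpus frequency) and that the second is type-level coverage (each distinct word counted once), since the abstract does not give the exact formulas. The function name `coverage`, the toy lexicon, and the toy tokens are hypothetical, and the input is assumed to be already tokenized and lowercased.

```python
from collections import Counter


def coverage(lexicon, corpus_tokens):
    """Return (token_coverage, type_coverage) of a lexicon over a corpus.

    token_coverage weights each word by its corpus frequency (assumed
    reading of the first method); type_coverage counts each distinct
    word once (assumed reading of the second method).
    """
    freq = Counter(corpus_tokens)      # frequency list of the corpus
    total_tokens = sum(freq.values())  # all running words
    total_types = len(freq)            # distinct words
    if total_tokens == 0:
        return 0.0, 0.0

    covered_tokens = sum(n for word, n in freq.items() if word in lexicon)
    covered_types = sum(1 for word in freq if word in lexicon)
    return covered_tokens / total_tokens, covered_types / total_types


# Toy data for illustration only (not taken from the paper).
toy_lexicon = {"dobar", "grozan"}
toy_tokens = "dobar dan dobar film grozan kraj dan".split()

token_cov, type_cov = coverage(toy_lexicon, toy_tokens)
print(f"token-level coverage: {token_cov:.2%}")
print(f"type-level coverage:  {type_cov:.2%}")
```

Under this reading, a lexicon that contains frequent words scores higher on the frequency-weighted measure than on the type-level one, which would be consistent with the gap between the two sets of figures reported above.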

