Javascript must be enabled to continue!
Identification of Review Helpfulness Using Novel Textual and Language-Context Features
View through CrossRef
With the increase in users of social media websites such as IMDb, a movie website, and the rise of publicly available data, opinion mining is more accessible than ever. In the research field of language understanding, categorization of movie reviews can be challenging because human language is complex, leading to scenarios where connotation words exist. Connotation words have a different meaning than their literal meanings. While representing a word, the context in which the word is used changes the semantics of words. In this research work, categorizing movie reviews with good F-Measure scores has been investigated with Word2Vec and three different aspects of proposed features have been inspected. First, psychological features are extracted from reviews positive emotion, negative emotion, anger, sadness, clout (confidence level) and dictionary words. Second, readablility features are extracted; the Automated Readability Index (ARI), the Coleman Liau Index (CLI) and Word Count (WC) are calculated to measure the review’s understandability score and their impact on review classification performance is measured. Lastly, linguistic features are also extracted from reviews adjectives and adverbs. The Word2Vec model is trained on collecting 50,000 reviews related to movies. A self-trained Word2Vec model is used for the contextualized embedding of words into vectors with 50, 100, 150 and 300 dimensions.The pretrained Word2Vec model converts words into vectors with 150 and 300 dimensions. Traditional and advanced machine-learning (ML) algorithms are applied and evaluated according to performance measures: accuracy, precision, recall and F-Measure. The results indicate Support Vector Machine (SVM) using self-trained Word2Vec achieved 86% F-Measure and using psychological, linguistic and readability features with concatenation of Word2Vec features SVM achieved 87.93% F-Measure.
Title: Identification of Review Helpfulness Using Novel Textual and Language-Context Features
Description:
With the increase in users of social media websites such as IMDb, a movie website, and the rise of publicly available data, opinion mining is more accessible than ever.
In the research field of language understanding, categorization of movie reviews can be challenging because human language is complex, leading to scenarios where connotation words exist.
Connotation words have a different meaning than their literal meanings.
While representing a word, the context in which the word is used changes the semantics of words.
In this research work, categorizing movie reviews with good F-Measure scores has been investigated with Word2Vec and three different aspects of proposed features have been inspected.
First, psychological features are extracted from reviews positive emotion, negative emotion, anger, sadness, clout (confidence level) and dictionary words.
Second, readablility features are extracted; the Automated Readability Index (ARI), the Coleman Liau Index (CLI) and Word Count (WC) are calculated to measure the review’s understandability score and their impact on review classification performance is measured.
Lastly, linguistic features are also extracted from reviews adjectives and adverbs.
The Word2Vec model is trained on collecting 50,000 reviews related to movies.
A self-trained Word2Vec model is used for the contextualized embedding of words into vectors with 50, 100, 150 and 300 dimensions.
The pretrained Word2Vec model converts words into vectors with 150 and 300 dimensions.
Traditional and advanced machine-learning (ML) algorithms are applied and evaluated according to performance measures: accuracy, precision, recall and F-Measure.
The results indicate Support Vector Machine (SVM) using self-trained Word2Vec achieved 86% F-Measure and using psychological, linguistic and readability features with concatenation of Word2Vec features SVM achieved 87.
93% F-Measure.
Related Results
Hubungan Perilaku Pola Makan dengan Kejadian Anak Obesitas
Hubungan Perilaku Pola Makan dengan Kejadian Anak Obesitas
<p><em><span style="font-size: 11.0pt; font-family: 'Times New Roman',serif; mso-fareast-font-family: 'Times New Roman'; mso-ansi-language: EN-US; mso-fareast-langua...
Evaluating the Science to Inform the Physical Activity Guidelines for Americans Midcourse Report
Evaluating the Science to Inform the Physical Activity Guidelines for Americans Midcourse Report
Abstract
The Physical Activity Guidelines for Americans (Guidelines) advises older adults to be as active as possible. Yet, despite the well documented benefits of physical a...
How reviewer level affects review helpfulness and reviewing behavior across hotel classifications: the case of Seoul in Korea
How reviewer level affects review helpfulness and reviewing behavior across hotel classifications: the case of Seoul in Korea
PurposeThe purpose of this study is to explore the effect of reviewer qualification and credibility (RQC) and hotel classification involving online hotel reviews (OHRs). The study ...
A Wideband mm-Wave Printed Dipole Antenna for 5G Applications
A Wideband mm-Wave Printed Dipole Antenna for 5G Applications
<span lang="EN-MY">In this paper, a wideband millimeter-wave (mm-Wave) printed dipole antenna is proposed to be used for fifth generation (5G) communications. The single elem...
Aviation English - A global perspective: analysis, teaching, assessment
Aviation English - A global perspective: analysis, teaching, assessment
This e-book brings together 13 chapters written by aviation English researchers and practitioners settled in six different countries, representing institutions and universities fro...
Rodnoosjetljiv jezik na primjeru njemačkih časopisa Brigitte i Der Spiegel
Rodnoosjetljiv jezik na primjeru njemačkih časopisa Brigitte i Der Spiegel
On the basis of the comparative analysis of texts of the German biweekly magazine Brigitte and the weekly magazine Der Spiegel and under the presumption that gender-sensitive langu...
REFLECTING THE ATTITUDES ABOUT THE SCHOLARLY CONTRIBUTION OF ACADEMICIAN VOJISLAV P. NIKČEVIĆ
REFLECTING THE ATTITUDES ABOUT THE SCHOLARLY CONTRIBUTION OF ACADEMICIAN VOJISLAV P. NIKČEVIĆ
The modern meaning of linguistic and literal science in Montenegro comes from the pioneer’s works of academic Vojislav P. Nikcevic, who made in period from 1965. to 2007., not only...
De-identifying government datasets:
De-identifying government datasets:
De-identification is a general term for any process of removing the association between a set of identifying data and the data subject. This document describes the use of de-identi...

