A Comparative Study of Some Selected Classifiers on an Imbalanced Dataset for Sentiment Analysis
Sentiment analysis makes it considerably easier to extract subjective information from online user-generated text. When different algorithms are applied to the same review dataset for a classification task, most classifiers produce accurate results while others produce limited or inaccurate predictions. This research evaluates four machine learning algorithms on the same online dataset: Naive Bayes, Support Vector Machine, K-Nearest Neighbor and Decision Tree. Which machine learning model performs best for sentiment analysis remains a recurring question, and our primary goal is to identify the most effective of these classifiers for sentiment analysis of English texts. Their robustness will be tested on an imbalanced dataset obtained from Kaggle.com, a machine learning repository. The dataset will first undergo preprocessing to enable analysis, followed by feature extraction for the base classifiers; the experiments will be carried out in a Jupyter notebook from Anaconda. The performance of each algorithm will be assessed using a confusion matrix, precision, recall and F1-score.
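The choice of evaluation metrics matters because the dataset is imbalanced. The following is a minimal sketch in plain Python, not the authors' code, of how the confusion-matrix counts, precision, recall and F1-score named above relate to each other. The labels, the 90/10 class split, the always-predict-majority classifier and the function names are all hypothetical and chosen only for illustration.

```python
def confusion_counts(y_true, y_pred, positive):
    """Return (tp, fp, fn, tn), treating `positive` as the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1, returning 0.0 where a ratio is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical imbalanced review labels: 90 positive, 10 negative.
y_true = ["pos"] * 90 + ["neg"] * 10
# Hypothetical degenerate classifier that always predicts the majority class.
y_pred = ["pos"] * 100

# Score the minority class ("neg"), which is where imbalance hurts.
tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive="neg")
accuracy = (tp + tn) / len(y_true)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

In this toy case the classifier reaches 90% accuracy while never identifying a single negative review, so its precision, recall and F1-score for the minority class are all zero. This is the reason a comparison on an imbalanced dataset would typically report these metrics alongside accuracy.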
International Journal of Innovative Science and Research Technology