
A Comparative Study of Some Selected Classifiers on an Imbalanced Dataset for Sentiment Analysis

Sentiment analysis makes it considerably easier to extract subjective information from online user-generated text. When different algorithms are applied to a review dataset for a classification task, most classifiers produce accurate results while others produce limited or inaccurate predictions. This research evaluates machine learning algorithms for classifying online datasets by testing four of them on the same data: Naive Bayes, Support Vector Machine, K-Nearest Neighbor, and Decision Tree. Determining which machine learning model performs best in sentiment analysis remains a recurring issue. Our primary goal is to identify the most effective of these classifiers for sentiment analysis of English texts. Their robustness will be tested on an imbalanced dataset obtained from Kaggle.com, a machine learning repository. The dataset will first undergo preprocessing to enable analysis, followed by feature extraction for the base classifiers; the experiments will be carried out in Jupyter Notebook from Anaconda. The performance of each algorithm will be assessed using the confusion matrix, precision, recall, and F1-score.
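The abstract evaluates classifiers with a confusion matrix, precision, recall, and F1-score rather than accuracy alone. The sketch below illustrates why that matters on imbalanced data. It is not the authors' code: the 90/10 class split, the label names `"pos"` and `"neg"`, and the always-predict-majority baseline are all hypothetical, chosen only to make the arithmetic easy to follow.

```python
def confusion_counts(y_true, y_pred, positive):
    """Return (tp, fp, fn, tn), treating `positive` as the class of interest."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive and truth != positive:
            fp += 1
        elif pred != positive and truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1, returning 0.0 where a ratio is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical imbalanced toy data: 90 positive reviews, 10 negative reviews.
y_true = ["pos"] * 90 + ["neg"] * 10
# Hypothetical baseline that always predicts the majority class.
y_pred = ["pos"] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Score the minority class, which is where an imbalanced dataset hides errors.
tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive="neg")
precision, recall, f1 = precision_recall_f1(tp, fp, fn)

print(f"accuracy={accuracy:.2f}")
print(f"minority class: tp={tp} fp={fp} fn={fn} tn={tn}")
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy case the baseline reaches 0.90 accuracy while never identifying a single negative review, so its minority-class precision, recall, and F1 are all zero. Per-class metrics of this kind are what allow the four classifiers to be compared fairly on an imbalanced dataset.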

International Journal of Innovative Science and Research Technology

Mohammed Ali Kawo Garba Muhammad Danlami Gabi Musa Sule Argungu

International Journal of Innovative Science and Research Technology (IJISRT)

2024

