Research on the internal influence factors of the text multi-classification problem
This paper deals with the classification of text data. Retrieval statistics show that more than 8000 articles on the subject can be found among the documents retrieved through the optical network. However, few of these papers examine the factors that affect text classification. The choice of classification method is important, but internal factors sometimes play a large role and can even determine the success or failure of the whole classification task. To address this gap, this paper uses the Rocchio algorithm as the classification method and tests experimentally how five internal factors influence text classification: category clustering density, class complexity, category definition, stop words, and document length. The experiments show that the higher the clustering density, the lower the class complexity, and the clearer the category definition, the higher the classification accuracy. Removing stop words also improves the results. Document length does not directly affect classification performance, but a suitable document length should be chosen according to the classification algorithm used.
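The abstract does not give implementation details, so the following is only a minimal sketch of a centroid-based Rocchio classifier, under several assumptions: it uses the simplified variant that averages each class's positive examples (no negative-example term), plain term-frequency vectors, and cosine similarity. The stop-word list, the function names, the toy training data, and the `clustering_density` measure (mean cosine similarity of a class's documents to its centroid) are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stop-word list for illustration; the paper does not specify one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is"}


def vectorize(text, remove_stop_words=True):
    """Turn a document into a length-normalized term-frequency vector (dict)."""
    tokens = text.lower().split()
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    counts = Counter(tokens)
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()} if norm else {}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def train_centroids(labeled_docs, remove_stop_words=True):
    """Average the document vectors of each class into one centroid per class."""
    sums = defaultdict(lambda: defaultdict(float))
    sizes = Counter()
    for text, label in labeled_docs:
        for term, weight in vectorize(text, remove_stop_words).items():
            sums[label][term] += weight
        sizes[label] += 1
    return {
        label: {term: w / sizes[label] for term, w in vec.items()}
        for label, vec in sums.items()
    }


def classify(text, centroids, remove_stop_words=True):
    """Assign the class whose centroid is most similar to the document."""
    vec = vectorize(text, remove_stop_words)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


def clustering_density(labeled_docs, centroids, label, remove_stop_words=True):
    """Assumed measure: mean cosine similarity of a class's documents to its centroid."""
    sims = [
        cosine(vectorize(text, remove_stop_words), centroids[label])
        for text, doc_label in labeled_docs
        if doc_label == label
    ]
    return sum(sims) / len(sims)


# Toy training data, invented for this sketch.
train = [
    ("the striker scored a goal in the match", "sports"),
    ("the team won the football match", "sports"),
    ("the bank raised the interest rate", "finance"),
    ("stock market and interest rate news", "finance"),
]
centroids = train_centroids(train)
prediction = classify("a late goal decided the match", centroids)
density = clustering_density(train, centroids, "sports")
```

Passing `remove_stop_words=False` to both `train_centroids` and `classify` gives a simple way to compare results with and without stop-word filtering, in the spirit of the stop-word factor the abstract mentions.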
Related Results
E-Press and Oppress
From elephants to ABBA fans, silicon to hormone, the following discussion uses a new research method to look at printed text, motion pictures and a te...
On Flores Island, do "ape-men" still exist? https://www.sapiens.org/biology/flores-island-ape-men/
Afaan Oromo Multi-Label News Text Classification Using Deep Learning Approach
Classification is a technique for categorizing textual data into a form of predefined categories. Due to its major consequences in regard to critical tasks such as...
Λc Physics at BESIII
In 2014 BESIII collected a data sample of 567 [Formula: see text] at [Formula: see text] = 4.6 GeV, which is just above the [Formula: see text] pair production threshold. By analyz...
Strong vb-dominating and vb-independent sets of a graph
Let [Formula: see text] be a graph. A vertex [Formula: see text] strongly (weakly) b-dominates block [Formula: see text] if [Formula: see text] ([Formula: see text]) for every vert...
Multi-label Emotion Classification on Social Media Comments using Deep learning
Social media is an online platform that people use to develop social networks or relationships with others. Every day, millions of people use different social medi...
Improving Medical Document Classification via Feature Engineering
Document classification (DC) is the task of assigning the predefined labels to unseen documents by utilizing the model trained on the available labeled documents...
Enhancing Non-Formal Learning Certificate Classification with Text Augmentation: A Comparison of Character, Token, and Semantic Approaches
Aim/Purpose: The purpose of this paper is to address the gap in the recognition of prior learning (RPL) by automating the classification of non-formal learning certificates using d...

