
Similarity Matching of Pairs of Text using CACT Algorithm

In data mining, short-text analysis is widely used across many applications. Traditional natural language processing tools, which rely on the syntax of the language, are difficult to apply to short text and often give incorrect results. Short texts provide sparse, insufficient data, and their noise and ambiguity make semantic knowledge hard to identify. In this paper, the authors propose replacing the cosine similarity coefficient with the Jaro-Winkler similarity measure to score the match between pairs of text (a source text and a target text). Jaro-Winkler is better suited to determining string similarity because it takes character order into account, using positional indices to estimate relevance. CACT driven by Jaro-Winkler is presumed to perform better on one-to-many data links than CACT driven by cosine similarity. The ensemble algorithm CACTS and SAE is adopted with the Jaro-Winkler similarity approach. The new algorithm is applied to short-text analysis to obtain better results. An evaluation of the proposed concept serves as validation.
Blue Eyes Intelligence Engineering and Sciences Publication - BEIESP
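The contrast the abstract draws between cosine and Jaro-Winkler similarity can be illustrated with a minimal sketch. This is not the authors' CACT implementation: the function names, the bag-of-characters vectors used for cosine, and the Winkler parameters (prefix scale 0.1, prefix cap 4, the commonly used defaults) are assumptions chosen for illustration. The `best_match` helper is a hypothetical stand-in for scoring one source text against many target texts.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine between bag-of-characters count vectors (assumed representation).

    Character order is ignored, so any two anagrams score 1.0.
    """
    va, vb = Counter(a), Counter(b)
    dot = sum(va[ch] * vb[ch] for ch in va)
    norm = math.sqrt(sum(n * n for n in va.values())) * math.sqrt(
        sum(n * n for n in vb.values())
    )
    return dot / norm if norm else 0.0


def jaro(s1: str, s2: str) -> float:
    """Jaro similarity: counts characters that match within a sliding window."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters only match if their positions differ by at most `window`.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (
        matches / len1 + matches / len2 + (matches - transpositions) / matches
    ) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for a shared prefix of up to `max_prefix` characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * p * (1 - base)


def best_match(source: str, targets: list[str], score=jaro_winkler) -> str:
    """Hypothetical one-to-many helper: return the highest-scoring target text."""
    return max(targets, key=lambda t: score(source, t))


if __name__ == "__main__":
    # Same characters, reversed order: cosine cannot tell them apart.
    print(round(cosine_similarity("abc", "cba"), 4))  # 1.0
    print(round(jaro_winkler("abc", "cba"), 4))       # 0.5556
    print(best_match("martha", ["marhta", "xyz"]))
```

On the pair "abc" and "cba" the order-insensitive cosine score is 1.0, while Jaro-Winkler drops to about 0.56 because only the middle character matches in position. This is the kind of order sensitivity the abstract attributes to Jaro-Winkler.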

Related Results

Sleep Habits and Occurrence of Lowback Pain among Craftsmen
Bounds on the sum of broadcast domination number and strong metric dimension of graphs
Let [Formula: see text] be a connected graph of order at least two with vertex set [Formula: see text]. For [Formula: see text], let [Formula: see text] denote the length of an [Fo...
ANALYSIS OF READING MATERIALS IN TEXTBOOK FOR GRADE XI SENIOR HIGH SCHOOL
This study aims to determine the GI and LD levels of the Advanced Learning English 2 textbook, the text with the highest GI and LD, and what gives that text the highest GI and LD. ...
Increased Fatty Acid Oxidation Protects Against Development of Metabolic Associated Steatotic Liver Disease
Metabolic dysfunction-associated steatotic liver disease (MASLD), a condition characterized by fat accumulation in the liver, affects over 30% ...
A saturation problem in meshes
Let [Formula: see text] and [Formula: see text] be graphs, where we view [Formula: see text] as the “host” graph and [Formula: see text] as a “forbidden” graph. A spanning subgraph...
2021 Census to Census Coverage Survey Matching Results.
The 2021 England and Wales Census was matched to the Census Coverage Survey (CCS). This was an essential requisite for estimating undercount in the Census. To ensure outputs could ...
A Fast Pattern Matching Algorithm Based on Middle Characters of Pattern String
String pattern matching is one of the important string operations. At present, the pattern matching algorithm of strings mainly includes BF algorithm, KMP algorithm, and improved KM...
