Search engine for discovering works of Art, research articles, and books related to Art and Culture
ShareThis
Javascript must be enabled to continue!

View through CrossRef
Plagiarism is the act of taking part or all of one's ideas in the form of documents or texts without including sources of information retrieval. This study aims to detect the similarity of text documents using the cosine similarity algorithm and weighting TF-IDF so that it can be used to determine the value of plagiarism. The document used for comparison of this text is an abstract of Indonesian. The results of the study, namely when stemming the similarity value is higher on average 10% than the stemming process is not done. This study produces a similarity value above 50% for documents with a high degree of similarity. Whereas documents with low similarity levels or no plagiarism produce similarity values ​​below 40%. With the method used in the preprocessing consisting of folding cases, tokenizing, removeal stopwords, and stemming. After the preprocessing process, the next step is to calculate the weighting of TF-IDF and the similarity value using cosine similarity so that it gets a percentage similarity value. Based on the experimental results of the cosine similarity algorithm and weighting TF-IDF, it can produce similarity values ​​from each comparative document
Title:
Description:
Plagiarism is the act of taking part or all of one's ideas in the form of documents or texts without including sources of information retrieval.
This study aims to detect the similarity of text documents using the cosine similarity algorithm and weighting TF-IDF so that it can be used to determine the value of plagiarism.
The document used for comparison of this text is an abstract of Indonesian.
The results of the study, namely when stemming the similarity value is higher on average 10% than the stemming process is not done.
This study produces a similarity value above 50% for documents with a high degree of similarity.
Whereas documents with low similarity levels or no plagiarism produce similarity values ​​below 40%.
With the method used in the preprocessing consisting of folding cases, tokenizing, removeal stopwords, and stemming.
After the preprocessing process, the next step is to calculate the weighting of TF-IDF and the similarity value using cosine similarity so that it gets a percentage similarity value.
Based on the experimental results of the cosine similarity algorithm and weighting TF-IDF, it can produce similarity values ​​from each comparative document.

Back to Top