
Semantic Similarity Calculating based on BERT

The exploration of semantic similarity is a fundamental aspect of natural language processing, as it aids in understanding the meaning and usage of the vocabulary of a language. The advent of pre-trained language models has significantly simplified research in this field. This article describes a methodology for using the pre-trained language model BERT to calculate the semantic similarity between Chinese words. To conduct this study, we first trained our own model starting from the bert-base-chinese pre-trained model. This allowed us to obtain a word embedding for each word, which served as the basis for calculating semantic similarity. Essentially, word embeddings are vector representations of words that capture a word's meaning and context, allowing the semantic similarity between words to be measured. We then ran a series of experiments to evaluate the effectiveness of the BERT model on semantic similarity tasks in Chinese. The results were encouraging: the BERT model performed strongly on these tasks and outperformed traditional methods in both performance and generalization. This study therefore underscores the potential of BERT for natural language processing, particularly for Chinese, and its capacity to calculate semantic similarity accurately paves the way for wider adoption in related fields.

Related Results

A Semantic Orthogonal Mapping Method Through Deep-Learning for Semantic Computing
In order to realize an artificial intelligence system, a basic mechanism should be provided for expressing and processing semantics. We have presented semantic computing models i...
A Pre-Training Technique to Localize Medical BERT and to Enhance Biomedical BERT
Background: Pre-training large-scale neural language models on raw texts has been shown to make a significant contribution to a strategy for transfer learning in n...
Exploiting Wikipedia Semantics for Computing Word Associations
Semantic association computation is the process of automatically quantifying the strength of a semantic connection between two textual units based on various lexi...
Similarity Search with Data Missing
Similarity search is a fundamental research problem with broad applications in various research fields, including data mining, information retrieval, and machine learning. The core...
Semantic Excel: An Introduction to a User-Friendly Online Software Application for Statistical Analyses of Text Data
Semantic Excel (www.semanticexcel.com) is an online software application with a simple, yet powerful interface enabling users to perform statistical analyses on texts. The purpose ...
Detecting Redundant Health Survey Questions Using Language-agnostic BERT Sentence Embedding (LaBSE) (Preprint)
BACKGROUND: As the importance of PGHD in healthcare and research has increased, efforts to standardize survey-based PGHD to improve its usability and interop...
ALBERT-QM: An ALBERT Based Method for Chinese Health Related Question Matching (Preprint)
BACKGROUND: Question answering (QA) systems are widely used in web-based healthcare applications. Health consumers likely ask similar questions in various n...
