AFR-BERT: Attention-based mechanism feature relevance fusion multimodal sentiment analysis model
Multimodal sentiment analysis is an essential task in natural language processing in which a machine, after learning multimodal emotional features, analyzes and recognizes emotions through logical reasoning and mathematical operations. To address the problems of effectively fusing multimodal data and modeling the relevance between modalities, we propose an attention-based feature relevance fusion model for multimodal sentiment analysis (AFR-BERT). In the data pre-processing stage, text features are extracted with the pre-trained language model BERT (Bidirectional Encoder Representations from Transformers), and a BiLSTM (Bidirectional Long Short-Term Memory) network captures the internal structure of the audio. In the data fusion stage, the multimodal data fusion network fuses the modalities through the interaction of text and audio information. In the data analysis stage, the multimodal data association network analyzes the fused representation by exploring the correlation between text and audio. In the data output stage, the model produces the sentiment analysis results. We conducted extensive comparative experiments on the publicly available sentiment analysis datasets CMU-MOSI and CMU-MOSEI. The results show that AFR-BERT improves on classical multimodal sentiment analysis models across the relevant performance metrics. In addition, ablation experiments and example analysis show that the multimodal data analysis network in AFR-BERT effectively captures and analyzes the sentiment features in text and audio.
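The fusion stage described above, in which text and audio features interact through attention, can be illustrated with a generic sketch. The snippet below implements plain scaled dot-product cross-modal attention in NumPy: each text step attends over the audio frames and the attended audio context is concatenated to the text features. This is a minimal, hypothetical illustration of the general mechanism, not the authors' exact AFR-BERT fusion network; the dimensions and the concatenation step are assumptions for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_fusion(text, audio):
    """Fuse audio into text via scaled dot-product attention.

    text:  (n_t, d) text features (e.g. from BERT)
    audio: (n_a, d) audio features (e.g. from a BiLSTM)
    returns: (n_t, 2*d) -- text concatenated with its attended audio context.
    """
    d = text.shape[-1]
    # Attention weights: each of the n_t text steps attends over n_a audio frames.
    scores = softmax(text @ audio.T / np.sqrt(d))  # (n_t, n_a), rows sum to 1
    context = scores @ audio                       # (n_t, d) attended audio
    return np.concatenate([text, context], axis=-1)

rng = np.random.default_rng(0)
text = rng.standard_normal((8, 16))    # 8 text tokens, 16-dim features
audio = rng.standard_normal((20, 16))  # 20 audio frames, 16-dim features
fused = cross_modal_fusion(text, audio)
```

In a full model, the fused representation would then be passed to the association/analysis network and finally to a classifier or regressor that outputs the sentiment score.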