Social Event Classification Based on Multimodal Masked Transformer Network
The key to multimodal social event classification is to make full and accurate use of the features of both the image and text modalities. However, most existing methods have two limitations: (1) they simply concatenate the image and text features of an event, and (2) irrelevant contextual information between the modalities causes mutual interference. It is therefore not enough to consider only the relationship between the modalities of multimodal data; the irrelevant contextual information (i.e., regions or words) between the modalities must also be taken into account. To overcome these limitations, a novel social event classification method based on a multimodal masked transformer network (MMTN) is proposed. First, an image-text encoding network learns better representations of the text and image. Then, the obtained image and text representations are fed into the multimodal masked transformer network, which fuses the multimodal information: it models the relationship between the modalities by calculating the similarity between them and masks the irrelevant context between the modalities. Extensive experiments on two benchmark datasets show that the proposed MMTN model achieves state-of-the-art performance.
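The similarity-then-mask idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' MMTN implementation: the function name `masked_cross_attention`, the use of cosine similarity, the fixed `threshold`, and the single attention head without learned projections are all assumptions made for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax; entries equal to -inf receive weight 0."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def masked_cross_attention(text, image, threshold=0.0):
    """Fuse image regions into word features, masking dissimilar pairs.

    text:  (n_words, d) word features (hypothetical encoder output)
    image: (n_regions, d) region features (hypothetical encoder output)
    threshold: assumed cosine-similarity cutoff; pairs at or below it are masked
    """
    d = text.shape[-1]
    eps = 1e-12  # guards against division by zero for all-zero feature vectors

    # Cosine similarity between every word and every region.
    t = text / (np.linalg.norm(text, axis=-1, keepdims=True) + eps)
    v = image / (np.linalg.norm(image, axis=-1, keepdims=True) + eps)
    sim = t @ v.T  # shape (n_words, n_regions)

    # Keep only word-region pairs that are similar enough.
    keep = sim > threshold
    # Always keep each word's best-matching region so no row is fully masked.
    keep[np.arange(sim.shape[0]), np.argmax(sim, axis=-1)] = True

    # Scaled dot-product attention with masked pairs set to -inf.
    scores = (text @ image.T) / np.sqrt(d)
    scores = np.where(keep, scores, -np.inf)
    weights = softmax(scores, axis=-1)

    fused = weights @ image  # shape (n_words, d)
    return fused, weights, keep


if __name__ == "__main__":
    words = np.array([[1.0, 0.0], [0.0, 1.0]])
    regions = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
    fused, weights, keep = masked_cross_attention(words, regions)
    print(weights)
```

In this sketch each word attends only to the regions that pass the similarity cutoff, so unrelated regions contribute nothing to the fused representation. A symmetric call with the arguments swapped would give the image-to-text direction.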
Related Results
Automatic Load Sharing of Transformer
The transformer plays a major role in the power system. It works 24 hours a day and provides power to the load. When the transformer is overloaded, its windings overheat, which lea...
Imagined worldviews in John Lennon’s “Imagine”: a multimodal re-performance / Visões de mundo imaginadas no “Imagine” de John Lennon: uma re-performance multimodal
Abstract: This paper addresses the issue of multimodal re-performance, a concept developed by us, in view of the fact that the famous song “Imagine”, by John Lennon, was published ...
LIFE CYCLE OF TRANSFORMER 110/X KV AND ITS VALUE
In a deregulated environment, power companies are in the constant process of reducing the costs of operating power facilities, with the aim of optimally improving the quality of de...
ANALISIS PENGARUH MASA OPERASIONAL TERHADAP PENURUNAN KAPASITAS TRANSFORMATOR DISTRIBUSI DI PT PLN (PERSERO) (Analysis of the Effect of Operating Period on the Capacity Decline of Distribution Transformers at PT PLN (Persero))
One cause of transformer interruption is loading that exceeds the capabilities of the transformer. Continuous overload will affect the age of the transformer and r...
DESIGNING A MULTIMODAL TRANSPORT NETWORK
Objective: To create a methodology for designing a multimodal transport network under various scenarios of socioeconomic development of the Russian Federation and its regions which...
AFR-BERT: Attention-based mechanism feature relevance fusion multimodal sentiment analysis model
Multimodal sentiment analysis is an essential task in natural language processing which refers to the fact that machines can analyze and recognize emotions through logical reasonin...
PLC Based Load Sharing of Transformers
The transformer is a very expensive and bulky piece of power system equipment. It runs and feeds the load 24 hours a day. Sometimes the load on the transformer unexpectedly rises above its...
Simulation modeling study on short circuit ability of distribution transformer
Abstract: Under short-circuit conditions, the oil-immersed distribution transformer will endure combined electro-thermal stress, eventually leading to the mechanical dama...

