
Hi-EADN: Hierarchical Excitation Aggregation and Disentanglement Frameworks for Action Recognition Based on Videos

Most existing video action recognition methods rely mainly on high-level semantic information from convolutional neural networks (CNNs) but ignore the discrepancies between different information streams. Moreover, they do not normally consider both long-distance aggregations and short-range motions. To solve these problems, we propose hierarchical excitation aggregation and disentanglement networks (Hi-EADNs), which include a multiple frame excitation aggregation (MFEA) module and a feature squeeze-and-excitation hierarchical disentanglement (SEHD) module. MFEA uses long-short range motion modelling and calculates the feature-level temporal difference. The SEHD module utilizes these differences to optimize the weights of each spatiotemporal feature and excite motion-sensitive channels. Moreover, without introducing additional parameters, this feature information is processed with a series of squeezes and excitations, and multiple temporal aggregations with neighbourhoods can enhance the interaction of different motion frames. Extensive experimental results confirm the effectiveness of the proposed Hi-EADN method on the UCF101 and HMDB51 benchmark datasets, where the top-5 accuracy is 93.5% and 76.96%, respectively.
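The abstract describes the mechanism only in words, so the following is a minimal sketch of the general idea: compute a feature-level temporal difference, squeeze it spatially, and use it to excite motion-sensitive channels. It is not the authors' implementation. The function name `motion_excitation`, the `step` parameter (1 for short-range, larger values for longer-range differences), the zero padding of the last frames, and the residual form of the output are all assumptions made for illustration. The sketch is parameter-free, in the spirit of the "without introducing additional parameters" remark, and uses only NumPy.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def motion_excitation(features, step=1):
    """Hypothetical parameter-free motion excitation (illustrative only).

    features: array of shape (T, C, H, W), one feature map per frame.
    step:     temporal offset of the difference; must satisfy 0 < step < T.
    Returns (excited_features, gate), where gate has shape (T, C).
    """
    # Feature-level temporal difference between frame t + step and frame t.
    diff = features[step:] - features[:-step]            # (T - step, C, H, W)
    # Assumption: pad the trailing frames with zeros so the length stays T.
    pad = np.zeros_like(features[:step])                 # (step, C, H, W)
    diff = np.concatenate([diff, pad], axis=0)           # (T, C, H, W)
    # Squeeze: global average pooling over the spatial dimensions.
    squeezed = diff.mean(axis=(2, 3))                    # (T, C)
    # Excitation: map the squeezed differences to per-channel gates in (0, 1).
    gate = sigmoid(squeezed)                             # (T, C)
    # Residual re-weighting keeps the original features and adds the gated ones.
    excited = features + features * gate[:, :, None, None]
    return excited, gate


if __name__ == "__main__":
    clip = np.random.default_rng(0).normal(size=(8, 16, 7, 7))
    short_range, _ = motion_excitation(clip, step=1)
    long_range, _ = motion_excitation(clip, step=4)
    print(short_range.shape, long_range.shape)
```

Calling the function with several `step` values and combining the results is one simple way to mix short-range and longer-range motion cues; how Hi-EADN actually aggregates them is not specified in this record.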

Related Results

Natural genetic variation and an alternative physiological state modify polyglutamine aggregation and toxicity in C. elegans
Many human diseases are caused by mutations that induce misfolding and aggregation of the affected proteins, and are thought to result from failures in proteostasis. Pathways invol...
Rhytidectomy: Analysis of Videos Available Online
Abstract: The objective of this study was to examine YouTube videos related to rhytidectomy created by both physicians and nonphysicians to determine the content of the videos, the s...
Shedding Frequency and Motion of Jujube Fruits in Various Excitation Modes
Highlights: The motion responses of fruits varied greatly in different excitation modes. The excitation modes included horizontal, vertical, and rotational. The shedding frequency of f...
Short video platforms as sources of health information about cervical cancer: A content and quality analysis
Background: The development of short popular science video platforms helps people obtain health information, but no research has evaluated the information characteristics and quality...
What are the most common and effective negative comments and coping strategies on the YouTube digital platform?
The present study aims to understand which type of negative comment is most common in the comments on videos posted on the YouTube platform, as well as the t...
Multimodal Emotion Recognition and Human Computer Interaction for AI-Driven Mental Health Support (Preprint)
Background: Mental health has become one of the most urgent global health issues of the twenty-first century. The World Health Organization (WHO) reports tha...
TarDis: Achieving Robust and Structured Disentanglement of Multiple Covariates
Summary: Addressing challenges in domain invariance within single-cell genomics necessitates innovative strategies to manage the heterogeneity of ...
