
Hi-EADN: Hierarchical Excitation Aggregation and Disentanglement Frameworks for Action Recognition Based on Videos

Most existing video action recognition methods rely mainly on high-level semantic information from convolutional neural networks (CNNs) but ignore the discrepancies between different information streams. They also do not normally consider both long-distance aggregations and short-range motions. To solve these problems, we propose hierarchical excitation aggregation and disentanglement networks (Hi-EADNs), which include a multiple frame excitation aggregation (MFEA) module and a feature squeeze-and-excitation hierarchical disentanglement (SEHD) module. MFEA models long- and short-range motion and calculates the feature-level temporal difference. The SEHD module uses these differences to optimize the weights of each spatiotemporal feature and excite motion-sensitive channels. Moreover, without introducing additional parameters, this feature information is processed with a series of squeezes and excitations, and multiple temporal aggregations with neighbourhoods enhance the interaction between different motion frames. Extensive experimental results confirm the effectiveness of the proposed Hi-EADN method on the UCF101 and HMDB51 benchmark datasets, where the top-5 accuracy is 93.5% and 76.96%, respectively.
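The idea of using feature-level temporal differences to excite motion-sensitive channels without extra parameters can be illustrated with a small sketch. This is not the Hi-EADN implementation: the function names (`spatial_squeeze`, `motion_gates`, `excite`), the `[T][C][N]` nested-list layout, the use of the absolute difference, the sigmoid gate, and the residual form `x + x * gate` are all assumptions made for illustration only.

```python
import math

def spatial_squeeze(frame):
    """Squeeze step: average each channel over its spatial positions ([C][N] -> [C])."""
    return [sum(channel) / len(channel) for channel in frame]

def motion_gates(features):
    """Compute per-frame, per-channel gates from adjacent-frame differences.

    features is a nested list shaped [T][C][N] (frames, channels, positions).
    The gate is sigmoid(|difference|), an illustrative choice with no learned
    parameters. The last frame has no successor, so its difference is zero.
    """
    squeezed = [spatial_squeeze(frame) for frame in features]
    last = len(squeezed) - 1
    gates = []
    for t, current in enumerate(squeezed):
        following = squeezed[min(t + 1, last)]
        gates.append([
            1.0 / (1.0 + math.exp(-abs(b - a)))
            for a, b in zip(current, following)
        ])
    return gates

def excite(features):
    """Excitation step: scale each channel by its gate, keeping a residual path."""
    gates = motion_gates(features)
    return [
        [[x + x * g for x in channel] for channel, g in zip(frame, frame_gates)]
        for frame, frame_gates in zip(features, gates)
    ]

# Hypothetical toy clip: 2 frames, 2 channels, 2 spatial positions.
# Channel 0 is static across frames; channel 1 changes between frames.
clip = [
    [[1.0, 1.0], [0.0, 0.0]],
    [[1.0, 1.0], [2.0, 2.0]],
]
gates = motion_gates(clip)
excited = excite(clip)
```

In this toy clip the static channel receives the neutral gate of 0.5, while the changing channel receives a larger gate, so channels that carry motion are weighted more strongly.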

MDPI AG

Zeyuan Hu, Eung-Joo Lee

Symmetry

2021



