Object Tracking in RGB-T Videos Using Modal-Aware Attention Network and Competitive Learning

Object tracking in RGB-thermal (RGB-T) videos is increasingly used in many fields, owing to the all-weather, all-day capability of dual-modality imaging systems and the rapid development of low-cost, miniaturized infrared cameras. However, effectively fusing dual-modality information to build a robust RGB-T tracker remains challenging. This paper proposes MaCNet, an RGB-T object tracking algorithm based on a modal-aware attention network and competitive learning. It comprises a feature extraction network, a modal-aware attention network, and a classification network. The feature extraction network is a two-stream network that extracts features from each modality's image. The modal-aware attention network integrates the original data, establishes an attention model that characterizes the importance of different feature layers, and then guides feature fusion to enhance the information interaction between modalities. The classification network constructs a modality-egoistic loss function through three parallel binary classifiers acting on the RGB branch, the thermal infrared branch, and the fusion branch, respectively. Guided by a competitive-learning training strategy, the entire network is fine-tuned toward the optimal fusion of the two modalities. Extensive experiments on several publicly available RGB-T datasets show that our tracker outperforms other recent RGB-T and RGB tracking approaches.
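
The abstract does not give the attention model or the modality-egoistic loss in closed form, so the following is only a minimal sketch of the general idea, not the MaCNet implementation. It assumes a hypothetical scoring rule (mean activation per modality) to weight the two streams before fusion, and an unweighted sum of binary cross-entropy terms for the three parallel classifiers. The function names `fuse` and `three_branch_loss` are illustrative and do not come from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(rgb_feat, tir_feat):
    """Attention-weighted fusion of two equal-length feature vectors.

    Assumption: each modality is scored by its mean activation, and a
    softmax turns the two scores into fusion weights. The paper's
    attention model is not specified in the abstract.
    """
    scores = [sum(rgb_feat) / len(rgb_feat), sum(tir_feat) / len(tir_feat)]
    w_rgb, w_tir = softmax(scores)
    fused = [w_rgb * r + w_tir * t for r, t in zip(rgb_feat, tir_feat)]
    return fused, (w_rgb, w_tir)


def bce(prob, label):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def three_branch_loss(p_rgb, p_tir, p_fused, label):
    """Sum the losses of three parallel binary classifiers.

    Assumption: the branches are weighted equally; the paper's
    modality-egoistic weighting is not reproduced here.
    """
    return bce(p_rgb, label) + bce(p_tir, label) + bce(p_fused, label)
```

With identical inputs, for example `fuse([1.0, 1.0], [1.0, 1.0])`, both modalities receive a weight of 0.5 and the fused vector equals either input. A modality with stronger mean activation receives the larger weight under this assumed scoring rule.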

Related Results

CREATING LEARNING MEDIA IN TEACHING ENGLISH AT SMP MUHAMMADIYAH 2 PAGELARAN ACADEMIC YEAR 2020/2021
The Covid-19 pandemic currently demands that teachers be able to use technology in the teaching and learning process. But in reality there are still many teachers who have not been able ...
Depth-aware salient object segmentation
Object segmentation is an important task which is widely employed in many computer vision applications such as object detection, tracking, recognition, and ret...
RGB-Guided Multi-Kernel Attention Feature Extraction Network for Hyperspectral Image Super-Resolution
Hyperspectral image (HSI) super-resolution aims to reconstruct high-spatial-resolution images from their low-resolution counterparts while preserving critical spectral fidelity. Ex...
ANALISIS MODAL KERJA PADA KOPERASI SERBA USAHA DI KOTA METRO
Working capital is an asset used to finance a company's day-to-day operations. Working capital usually takes the form of cash, receivables, and inventories, all of which ...
Is a Fitbit a Diary? Self-Tracking and Autobiography
Data becomes something of a mirror in which people see themselves reflected. (Sorapure 270) In a 2014 essay for The New Yorker, the humourist David Sedaris recounts an obsession spu...
Rhytidectomy: Analysis of Videos Available Online
The objective of this study was to examine YouTube videos related to rhytidectomy created by both physicians and nonphysicians to determine the content of the videos, the s...
A SAM2-Driven RGB-T Annotation Pipeline with Thermal-Guided Refinement for Semantic Segmentation in Search-and-Rescue Scenes
High-quality RGB–thermal infrared (RGB-T) semantic segmentation datasets are crucial for search-and-rescue (SAR) applications, yet their development is hindered by the scarcity of ...
Deep Attention Models for Human Tracking Using RGBD
Visual tracking performance has long been limited by the lack of better appearance models. These models fail either where they tend to change rapidly, like in motion-based tracking...
