DASYOLO: Dual-Attention-Synergistic YOLO for Cross-Modality Object Detection
Abstract
Fusing infrared and visible images overcomes the limitations of single-modality object detection and shows clear advantages in adverse conditions such as low illumination and haze. However, existing cross-modal object detection methods predominantly apply their attention mechanisms sequentially during fusion, which limits feature representation and reduces computational efficiency. To address both issues, we introduce DASYOLO, a cross-modal object detection network built on a dual-attention synergy mechanism. The network integrates a shallow feature enhancement module (BiAttention) and a cross-modal synergy attention module (DAS). First, the BiAttention module exploits the expressive potential of shallow features through two complementary attention mechanisms, providing a robust foundation for synergistic cross-modal feature interaction. Then, considering that different feature channels are semantically complementary in their spatial distribution, the DAS module adopts a parallel architecture that captures channel importance and spatial significance simultaneously, combining the strengths of both attention mechanisms through interactive learning. This design provides spatial guidance and alleviates semantic differences, improving the model's feature discrimination and inference efficiency. Qualitative and quantitative results show that the proposed method achieves favorable computational efficiency while maintaining high detection accuracy, improving the overall performance of cross-modal object detection.
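The contrast the abstract draws between sequential and parallel attention can be illustrated with a toy sketch. This is not the DAS module: the abstract gives no equations, so the pooling choices, the sigmoid gating, the product used to combine the two weight maps, and all function names below are assumptions made purely for illustration, and the interactive learning step is not modeled. Feature maps are plain nested lists indexed as [channel][row][column].

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def channel_weights(feat):
    """One weight per channel: sigmoid of that channel's global average (assumed gating)."""
    return [
        sigmoid(sum(sum(row) for row in ch) / (len(ch) * len(ch[0])))
        for ch in feat
    ]

def spatial_weights(feat):
    """One weight per pixel: sigmoid of the mean over channels at that pixel (assumed gating)."""
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    return [
        [sigmoid(sum(feat[k][i][j] for k in range(c)) / c) for j in range(w)]
        for i in range(h)
    ]

def parallel_attention(feat):
    """Parallel variant: both weight maps are computed from the same input,
    then applied together (combined here by a simple product, an assumption)."""
    cw = channel_weights(feat)
    sw = spatial_weights(feat)
    return [
        [[v * cw[k] * sw[i][j] for j, v in enumerate(row)] for i, row in enumerate(ch)]
        for k, ch in enumerate(feat)
    ]

def sequential_attention(feat):
    """Sequential variant: channel attention is applied first, and the spatial
    weights are then computed from the already reweighted features."""
    cw = channel_weights(feat)
    scaled = [[[v * cw[k] for v in row] for row in ch] for k, ch in enumerate(feat)]
    sw = spatial_weights(scaled)
    return [
        [[v * sw[i][j] for j, v in enumerate(row)] for i, row in enumerate(ch)]
        for ch in scaled
    ]

if __name__ == "__main__":
    # Hypothetical 2-channel, 2x2 feature map.
    feat = [[[1.0, 0.0], [0.5, 2.0]], [[0.2, 0.4], [1.5, 0.1]]]
    print(parallel_attention(feat)[0][0][0])
    print(sequential_attention(feat)[0][0][0])
```

In the parallel variant neither branch depends on the other's output, so the two weight maps can be computed independently. In the sequential variant the spatial branch has to wait for the channel branch and only sees its reweighted features.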
Related Results
Lightweight fruit detection algorithms for low‐power computing devices
A lightweight fruit detection algorithm is important to ensure real‐time detection on low‐power computing devices while maintaining detection accuracy. I...
AVS-YOLO: Object Detection in Aerial Visual Scene
Difficult object detection and class imbalance in object detection are the two main challenges faced by aerial image object detection. Difficult objects include small objects, obje...
Depth-aware salient object segmentation
Object segmentation is an important task which is widely employed in many computer vision applications such as object detection, tracking, recognition, and ret...
Adaptive Drop Approaches to Train Spiking-YOLO Network for Traffic Flow Counting
Traffic flow counting is an object detection problem. YOLO ("You Only Look Once") is a popular object detection network. Spiking-YOLO converts the YOLO network f...
When Does a Dual Matrix Have a Dual Generalized Inverse?
This paper deals with the existence of various types of dual generalized inverses of dual matrices. New and foundational results on the necessary and sufficient conditions for vari...
YOLO-V2 (You Only Look Once)
The you-only-look-once (YOLO) v2 object detector uses a single stage object detection network. YOLO v2 is faster than other two-stage deep learning object detectors, such as region...
SD-YOLO: A Lightweight and High-Performance Deep Model for Small and Dense Object Detection
Object detection in remote sensing imagery from unmanned aerial vehicles (UAVs) is crucial yet challenging, demanding efficient algorithms for high accuracy and re...
Object Detection Using CNN
Object detection system using Convolutional Neural Network (CNN) that can accurately identify and classify objects in videos. The purpose of object detection using CNN to enhance te...

