
Integrating Pose Features and Cross-Relationship Learning for Human–Object Interaction Detection

Background: The main challenge in human–object interaction (HOI) detection is reasoning accurately about interactions that are ambiguous, complex, or difficult to recognize. The model structures of existing methods are relatively simple, and interactions may not be recognized accurately when parts of the input image are occluded. Methods: In this paper, we design a Pose-Aware Interaction Network (PAIN), based on the transformer architecture and human pose, that addresses these issues through two innovations. First, a new feature fusion method fuses human pose features with image features early, before the encoder, to improve feature expressiveness, and additionally strengthens individual motion-related features by adding them to the human branch. Second, the Cross-Attention Relationship fusion Module (CARM) fuses the outputs of the three branches more effectively and captures the detailed relationship information of HOI. Results: The proposed method achieves 64.51% AP_role #1 and 66.42% AP_role #2 on the public V-COCO dataset and 30.83% AP on HICO-DET, indicating that it recognizes HOI instances more accurately.
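The two ideas in the abstract, early fusion of pose and image features and cross-attention over branch outputs, can be illustrated with a minimal NumPy sketch. This is not the authors' PAIN or CARM implementation. The function names, the additive fusion, the single-head attention without learned weights, and the naming of the three branches as human, object, and interaction are all assumptions made for illustration. The sketch also assumes that pose features have already been arranged as one vector per image token.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def early_fuse(image_tokens, pose_tokens, w_pose):
    """Hypothetical early fusion: project pose features to the image
    feature width and add them token-wise, before any encoder runs.

    image_tokens: (N, D), pose_tokens: (N, P), w_pose: (P, D)
    """
    return image_tokens + pose_tokens @ w_pose


def cross_attention(query, context):
    """Single-head scaled dot-product cross-attention.

    query: (Q, D), context: (K, D). Returns the attended values (Q, D)
    and the attention weights (Q, K). No learned projections are used.
    """
    d = query.shape[-1]
    scores = query @ context.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    return weights @ context, weights


def fuse_branches(human, obj, interaction):
    """Hypothetical three-branch fusion: interaction queries attend over
    the concatenated human and object outputs, with a residual add."""
    context = np.concatenate([human, obj], axis=0)
    attended, weights = cross_attention(interaction, context)
    return interaction + attended, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Assumed toy sizes: 6 image tokens, width 8, pose width 4.
    image_tokens = rng.normal(size=(6, 8))
    pose_tokens = rng.normal(size=(6, 4))
    w_pose = rng.normal(size=(4, 8))
    encoder_input = early_fuse(image_tokens, pose_tokens, w_pose)

    # Assumed toy branch outputs: 3 queries per branch, width 8.
    human = rng.normal(size=(3, 8))
    obj = rng.normal(size=(3, 8))
    interaction = rng.normal(size=(3, 8))
    fused, weights = fuse_branches(human, obj, interaction)
    print(encoder_input.shape, fused.shape, weights.shape)
```

In this sketch each interaction query produces a probability distribution over the human and object outputs, so each row of `weights` sums to 1. A trained model would normally add learned query, key, and value projections and multiple heads.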

