Deep Contextual Attention for Human-Object Interaction Detection
Tiancai Wang, Rao Muhammad Anwer, Muhammad Haris Khan, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao, Jorma Laaksonen

TL;DR
This paper introduces a contextual attention framework for human-object interaction detection that leverages context-aware features and adaptive attention to improve detection accuracy across multiple benchmarks.
Contribution
The paper presents an attention-based approach that learns contextually-aware appearance features for human and object instances and adaptively selects instance-centric context, outperforming previous methods on V-COCO, HICO-DET, and HCVRD.
Findings
Outperforms the state of the art on the V-COCO, HICO-DET, and HCVRD datasets.
Achieves a 4.4% relative gain in role mAP on V-COCO over the previous best approach.
Effectively utilizes context to capture subtle human-object interactions.
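A relative gain is a ratio against the baseline score, not a difference in mAP points, so the two are easy to confuse. The short sketch below shows the distinction. The scores in it are hypothetical placeholders chosen only so that the relative gain comes out near 4.4%; they are not the numbers reported in the paper.

```python
def absolute_gain(new_score: float, baseline_score: float) -> float:
    """Difference in metric points (e.g. mAP points)."""
    return new_score - baseline_score


def relative_gain(new_score: float, baseline_score: float) -> float:
    """Improvement as a percentage of the baseline score."""
    return (new_score - baseline_score) / baseline_score * 100.0


# Hypothetical role-mAP values, NOT taken from the paper.
baseline_map = 50.0
new_map = 52.2

print(f"absolute gain: {absolute_gain(new_map, baseline_map):.1f} points")
print(f"relative gain: {relative_gain(new_map, baseline_map):.1f}%")
```

With these placeholder values, a 2.2-point absolute improvement corresponds to a 4.4% relative improvement.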
Abstract
Human-object interaction detection is an important and relatively new class of visual relationship detection tasks, essential for deeper scene understanding. Most existing approaches decompose the problem into object localization and interaction recognition. Despite showing progress, these approaches only rely on the appearances of humans and objects and overlook the available context information, crucial for capturing subtle interactions between them. We propose a contextual attention framework for human-object interaction detection. Our approach leverages context by learning contextually-aware appearance features for human and object instances. The proposed attention module then adaptively selects relevant instance-centric context information to highlight image regions likely to contain human-object interactions. Experiments are performed on three benchmarks: V-COCO, HICO-DET and…
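The idea of adaptively selecting instance-centric context can be illustrated with a generic attention step: an instance feature scores a set of context features, and the softmax-normalized scores weight their sum. This is a minimal sketch under assumptions, not the authors' architecture. The function names, the scaled dot-product scoring, the feature dimension, and the random inputs are all illustrative choices that do not come from the paper.

```python
import numpy as np


def softmax(x: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over a 1-D array."""
    shifted = x - x.max()
    exp = np.exp(shifted)
    return exp / exp.sum()


def instance_centric_attention(instance_feat: np.ndarray,
                               context_feats: np.ndarray):
    """Weight context features by their similarity to one instance.

    instance_feat: shape (d,), feature of a human or object instance.
    context_feats: shape (n, d), features of n context regions.
    Returns the attended context vector (d,) and the weights (n,).
    """
    d = instance_feat.shape[0]
    # Assumed scoring rule: scaled dot product between instance and context.
    scores = context_feats @ instance_feat / np.sqrt(d)
    weights = softmax(scores)
    # Weighted sum emphasizes context regions relevant to this instance.
    attended = weights @ context_feats
    return attended, weights


# Hypothetical inputs: one 8-dimensional instance feature, 5 context regions.
rng = np.random.default_rng(0)
human_feat = rng.normal(size=8)
context = rng.normal(size=(5, 8))

attended, weights = instance_centric_attention(human_feat, context)
print(attended.shape, weights.shape)
```

Because the weights depend on the instance feature, each human or object instance yields its own weighting of the same context regions, which is what "instance-centric" refers to in this sketch.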
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Human Pose and Action Recognition
