ContextHOI: Spatial Context Learning for Human-Object Interaction Detection

Mingda Jia; Liming Zhao; Ge Li; Yun Zheng

arXiv:2412.09050·cs.CV·December 13, 2024

TL;DR

ContextHOI introduces a dual-branch framework that captures spatial context alongside object detection features for human-object interaction detection. It is aimed at scenes where instances are occluded or blurred, and it achieves state-of-the-art results on HICO-DET and V-COCO.

Contribution

The paper proposes a novel dual-branch framework with context-aware supervision to enhance spatial context learning in HOI detection without extra background labels.

Findings

1. Achieves state-of-the-art performance on the HICO-DET and V-COCO benchmarks.

2. Excels at recognizing interactions that involve occluded or blurred instances.

3. Introduces the HICO-ambiguous benchmark for challenging HOI evaluation.

Abstract

Spatial contexts, such as backgrounds and surroundings, are considered critical in Human-Object Interaction (HOI) recognition, especially when the instance-centric foreground is blurred or occluded. Recent HOI detectors are usually built upon detection transformer pipelines. While such an object-detection-oriented paradigm shows promise in localizing objects, its exploration of spatial context is often insufficient for accurately recognizing human actions. To enhance the capabilities of object detectors for HOI detection, we present a dual-branch framework named ContextHOI, which efficiently captures both object detection features and spatial contexts. In the context branch, we train the model to extract informative spatial context without requiring additional hand-crafted background labels. Furthermore, we introduce context-aware spatial and semantic supervision to…

Taxonomy

Topics: Multimodal Machine Learning Applications · Geographic Information Systems Studies · Context-Aware Activity Recognition Systems