Exploiting Scene Graphs for Human-Object Interaction Detection

Tao He; Lianli Gao; Jingkuan Song; Yuan-Fang Li

arXiv:2108.08584 · cs.CV · August 20, 2021

TL;DR

This paper introduces SG2HOI, an approach that uses scene graphs to bring high-level semantic relationships into human-object interaction detection, improving accuracy on the V-COCO and HICO-DET benchmarks.

Contribution

The paper proposes a new method that embeds scene graphs into HOI detection, utilizing relation-aware message passing to enhance contextual understanding.

Findings

01. Outperforms state-of-the-art on V-COCO and HICO-DET datasets

02. Demonstrates the effectiveness of scene graph integration in HOI detection

03. Provides a new framework for relation-aware contextual modeling

Abstract

Human-Object Interaction (HOI) detection is a fundamental visual task that aims to localize and recognize interactions between humans and objects. Existing works focus on the visual and linguistic features of humans and objects. However, they do not capitalise on the high-level semantic relationships present in the image, which provide crucial contextual and detailed relational knowledge for HOI inference. We propose a novel method that exploits this information, through the scene graph (SG), for the Human-Object Interaction detection task. Our method, SG2HOI, incorporates the SG information in two ways: (1) we embed a scene graph into a global context clue, serving as the scene-specific environmental context; and (2) we build a relation-aware message-passing module to gather relationships from objects' neighborhoods and transfer them into interactions. Empirical evaluation shows…
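The two uses of the scene graph described in the abstract can be illustrated with a toy sketch. This is not the authors' architecture: the pooling rule, the additive message function, the feature sizes, and every name below (`global_context`, `message_passing`, `pair_feature`, the example scene) are assumptions chosen only to make the idea concrete in plain Python.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Triple = Tuple[str, str, str]  # (subject, predicate, object)


def mean(vectors: List[Vector], dim: int) -> Vector:
    """Element-wise mean; a zero vector when there is nothing to average."""
    if not vectors:
        return [0.0] * dim
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum of two vectors of equal length."""
    return [x + y for x, y in zip(a, b)]


def global_context(nodes: Dict[str, Vector], predicates: Dict[str, Vector],
                   triples: List[Triple], dim: int) -> Vector:
    """(1) Assumed pooling: average all node and edge-predicate embeddings
    into a single scene-level context vector."""
    items = list(nodes.values()) + [predicates[p] for _, p, _ in triples]
    return mean(items, dim)


def message_passing(nodes: Dict[str, Vector], predicates: Dict[str, Vector],
                    triples: List[Triple], dim: int) -> Dict[str, Vector]:
    """(2) Assumed update: one round in which each node receives
    (neighbor feature + predicate embedding) along every incident edge,
    then adds the mean of its messages to its own feature."""
    inbox: Dict[str, List[Vector]] = {name: [] for name in nodes}
    for subj, pred, obj in triples:
        inbox[obj].append(add(nodes[subj], predicates[pred]))
        inbox[subj].append(add(nodes[obj], predicates[pred]))
    return {name: add(feat, mean(inbox[name], dim))
            for name, feat in nodes.items()}


def pair_feature(updated: Dict[str, Vector], context: Vector,
                 human: str, obj: str) -> Vector:
    """Concatenate the human, object, and scene context vectors; a real
    system would feed this to an interaction classifier."""
    return updated[human] + updated[obj] + context


# Hypothetical scene: "person riding horse", "horse on grass".
DIM = 2
nodes = {"person": [1.0, 0.0], "horse": [0.0, 1.0], "grass": [1.0, 1.0]}
predicates = {"riding": [0.5, 0.5], "on": [0.0, 1.0]}
triples = [("person", "riding", "horse"), ("horse", "on", "grass")]

context = global_context(nodes, predicates, triples, DIM)
updated = message_passing(nodes, predicates, triples, DIM)
feature = pair_feature(updated, context, "person", "horse")
print(feature)
```

In this sketch the horse node ends up carrying information from both "riding" and "on", which is the intuition behind letting neighborhood relationships inform the interaction prediction for a human-object pair.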

Code & Models

Repositories

ht014/sg2hoi (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Visual Attention and Saliency Detection · Human Pose and Action Recognition