Parallel Reasoning Network for Human-Object Interaction Detection

Huan Peng; Fenggang Liu; Yangguang Li; Bin Huang; Jing Shao; Nong Sang; Changxin Gao

arXiv:2301.03510·cs.CV·January 10, 2023·5 cites

Open Access

TL;DR

This paper introduces PR-Net, a transformer-based model with two independent predictors for human-object interaction detection, improving semantic understanding and localization accuracy.

Contribution

The paper proposes a novel parallel reasoning network with separate predictors for instance and relation understanding, addressing limitations of shared predictor models.

Findings

01

Achieved competitive results on HICO-DET and V-COCO benchmarks.

02

Effectively differentiates instance localization from relation understanding.

03

Improved semantic understanding in HOI detection.

Abstract

Human-Object Interaction (HOI) detection aims to learn how humans interact with surrounding objects. Previous HOI detection frameworks detect humans, objects, and their corresponding interactions simultaneously with a single shared predictor. A single shared predictor cannot differentiate the attentive field of instance-level prediction from that of relation-level prediction. To solve this problem, we propose a new transformer-based method named Parallel Reasoning Network (PR-Net), which constructs two independent predictors for instance-level localization and relation-level understanding. The former predictor concentrates on instance-level localization by perceiving instances' extremity regions. The latter broadens the scope of the relation region to reach a better relation-level semantic understanding. Extensive experiments and analysis on the HICO-DET benchmark show that our PR-Net effectively alleviated…
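
The two-predictor idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the class names (`InstancePredictor`, `RelationPredictor`, `ParallelHOIHead`), the plain linear layers, and the output keys are all hypothetical, and the attention over extremity regions and broadened relation regions is not modeled. It only shows the structural point that the instance-level and relation-level branches hold separate parameters instead of sharing one predictor.

```python
import random


def make_linear(in_dim, out_dim, rng):
    """Return (weights, bias) for a small randomly initialised linear layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def linear(layer, x):
    """Apply a (weights, bias) layer to a feature vector given as a list."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


class InstancePredictor:
    """Hypothetical instance-level branch: boxes and object class logits."""

    def __init__(self, dim, num_objects, rng):
        self.human_box = make_linear(dim, 4, rng)
        self.object_box = make_linear(dim, 4, rng)
        self.object_cls = make_linear(dim, num_objects, rng)

    def __call__(self, feat):
        return {
            "human_box": linear(self.human_box, feat),
            "object_box": linear(self.object_box, feat),
            "object_logits": linear(self.object_cls, feat),
        }


class RelationPredictor:
    """Hypothetical relation-level branch: interaction (verb) logits."""

    def __init__(self, dim, num_verbs, rng):
        self.verb_cls = make_linear(dim, num_verbs, rng)

    def __call__(self, feat):
        return {"verb_logits": linear(self.verb_cls, feat)}


class ParallelHOIHead:
    """Two independent branches applied in parallel to the same query."""

    def __init__(self, dim, num_objects, num_verbs, seed=0):
        rng = random.Random(seed)
        # Each branch gets its own projection, so no parameters are shared
        # between instance-level and relation-level prediction.
        self.instance_proj = make_linear(dim, dim, rng)
        self.relation_proj = make_linear(dim, dim, rng)
        self.instance = InstancePredictor(dim, num_objects, rng)
        self.relation = RelationPredictor(dim, num_verbs, rng)

    def __call__(self, query):
        inst_feat = linear(self.instance_proj, query)
        rel_feat = linear(self.relation_proj, query)
        out = dict(self.instance(inst_feat))
        out.update(self.relation(rel_feat))
        return out


# HICO-DET defines 80 object categories and 117 verb categories.
head = ParallelHOIHead(dim=8, num_objects=80, num_verbs=117)
query = [0.5] * 8  # stand-in for one decoder query embedding
pred = head(query)
```

Because the branches do not share weights, changing the relation branch leaves the box and object predictions untouched, which is the separation a single shared predictor cannot provide.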

Taxonomy

Topics: Multimodal Machine Learning Applications · Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning