Novel Human-Object Interaction Detection via Adversarial Domain Generalization

Yuhang Song; Wenbo Li; Lei Zhang; Jianwei Yang; Emre Kiciman; Hamid Palangi; Jianfeng Gao; C.-C. Jay Kuo; and Pengchuan Zhang

arXiv:2005.11406·cs.CV·May 26, 2020·5 cites

TL;DR

This paper introduces an adversarial domain generalization framework that learns object-invariant features for predicate prediction, improving the detection of human-object interactions whose object-predicate combinations were not seen during training.

Contribution

The paper proposes a novel adversarial domain generalization approach for HOI detection, addressing the challenge of unseen object-predicate combinations and improving generalization to novel scenarios.

Findings

01. Up to 50% performance increase on a new HICO-DET split

02. Up to 125% improvement on UnRel dataset for novel HOI detection

03. Effective learning of object-invariant features for unseen interactions

Abstract

We study in this paper the problem of novel human-object interaction (HOI) detection, aiming at improving the generalization ability of the model to unseen scenarios. The challenge mainly stems from the large compositional space of objects and predicates, which leads to the lack of sufficient training data for all the object-predicate combinations. As a result, most existing HOI methods heavily rely on object priors and can hardly generalize to unseen combinations. To tackle this problem, we propose a unified framework of adversarial domain generalization to learn object-invariant features for predicate prediction. To measure the performance improvement, we create a new split of the HICO-DET dataset, where the HOIs in the test set are all unseen triplet categories in the training set. Our experiments show that the proposed framework significantly increases the performance by up to 50%…
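The split described in the abstract, where every test HOI is a triplet category absent from training, can be illustrated with a minimal sketch. It is not the authors' actual procedure for HICO-DET: the annotation format `(image_id, predicate, object)`, the function names, and the toy data are all assumptions made for illustration, and a real split would also have to handle images that contain several HOIs.

```python
def split_by_unseen_triplets(annotations, held_out):
    """Partition HOI annotations so that held-out triplet categories appear only in the test set.

    annotations: iterable of (image_id, predicate, object) tuples (hypothetical format).
    held_out: set of (predicate, object) pairs reserved for testing.
    """
    train, test = [], []
    for image_id, predicate, obj in annotations:
        target = test if (predicate, obj) in held_out else train
        target.append((image_id, predicate, obj))
    return train, test


def triplet_categories(split):
    """Return the set of (predicate, object) categories present in a split."""
    return {(predicate, obj) for _, predicate, obj in split}


# Toy data (invented for this sketch, not taken from HICO-DET).
annotations = [
    ("img1", "ride", "bicycle"),
    ("img2", "ride", "horse"),
    ("img3", "feed", "horse"),
    ("img4", "hold", "bicycle"),
    ("img5", "ride", "horse"),
]
held_out = {("ride", "horse")}

train, test = split_by_unseen_triplets(annotations, held_out)

# The defining property of the split: no triplet category is shared.
assert triplet_categories(train).isdisjoint(triplet_categories(test))
```

In this toy example the predicate "ride" and the object "horse" each occur in training, but never together, so a model that leans on object priors has no direct evidence for the held-out combination.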

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition