Novel Human-Object Interaction Detection via Adversarial Domain Generalization
Yuhang Song, Wenbo Li, Lei Zhang, Jianwei Yang, Emre Kiciman, Hamid Palangi, Jianfeng Gao, C.-C. Jay Kuo, and Pengchuan Zhang

TL;DR
This paper introduces an adversarial domain generalization framework to improve the detection of unseen human-object interactions by learning object-invariant features, significantly enhancing generalization performance on new HOI categories.
Contribution
The paper proposes a novel adversarial domain generalization approach for HOI detection, addressing the challenge of unseen object-predicate combinations and improving generalization to novel scenarios.
Findings
Up to 50% performance increase on a new HICO-DET split
Up to 125% improvement on the UnRel dataset for novel HOI detection
Effective learning of object-invariant features for unseen interactions
Abstract
We study in this paper the problem of novel human-object interaction (HOI) detection, aiming at improving the generalization ability of the model to unseen scenarios. The challenge mainly stems from the large compositional space of objects and predicates, which leads to the lack of sufficient training data for all the object-predicate combinations. As a result, most existing HOI methods heavily rely on object priors and can hardly generalize to unseen combinations. To tackle this problem, we propose a unified framework of adversarial domain generalization to learn object-invariant features for predicate prediction. To measure the performance improvement, we create a new split of the HICO-DET dataset, where the HOIs in the test set are all unseen triplet categories in the training set. Our experiments show that the proposed framework significantly increases the performance by up to 50%…
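The split described in the abstract, where every HOI category in the test set is absent from training, can be illustrated with a small sketch. This is not the authors' split procedure: the `HOI` record, the function names `split_unseen_combinations` and `check_split`, and the toy annotations are all hypothetical, and the choice of which combinations to hold out is assumed to be given.

```python
from collections import namedtuple

# Hypothetical annotation record: one labeled interaction in one image.
HOI = namedtuple("HOI", ["image_id", "predicate", "obj"])


def split_unseen_combinations(instances, held_out):
    """Partition instances so every held-out (predicate, object) pair
    appears only in the test set and never in the training set."""
    held_out = set(held_out)
    train = [x for x in instances if (x.predicate, x.obj) not in held_out]
    test = [x for x in instances if (x.predicate, x.obj) in held_out]
    return train, test


def check_split(train, test):
    """Report whether the split is compositionally novel but still learnable."""
    train_pairs = {(x.predicate, x.obj) for x in train}
    test_pairs = {(x.predicate, x.obj) for x in test}
    train_predicates = {x.predicate for x in train}
    return {
        # No test combination may also occur in training.
        "disjoint": train_pairs.isdisjoint(test_pairs),
        # Each test predicate should still be seen in training with another object.
        "predicates_covered": all(p in train_predicates for p, _ in test_pairs),
    }


# Toy annotations (assumed for illustration only).
instances = [
    HOI(1, "ride", "bicycle"),
    HOI(2, "ride", "horse"),
    HOI(3, "feed", "horse"),
    HOI(4, "hold", "bicycle"),
    HOI(5, "ride", "elephant"),
    HOI(6, "feed", "elephant"),
]

train, test = split_unseen_combinations(instances, [("ride", "elephant")])
report = check_split(train, test)
```

In this toy case the model sees "ride" with bicycles and horses and sees elephants being fed, but never sees "ride elephant" during training. A predicate classifier that leans on object priors would struggle on such a test set, which is the failure mode the object-invariant features are meant to address.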
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition
