A Review of Human-Object Interaction Detection
Yuxiao Wang, Yu Lei, Li Cui, Weiying Xue, Qi Liu, Zhenao Wei

TL;DR
This paper reviews recent advances in human-object interaction detection, covering datasets, methods, and emerging trends such as zero-shot learning and large-scale language models, and it highlights current challenges and future directions.
Contribution
It provides a comprehensive summary and analysis of recent developments in image-based HOI detection, including datasets, methodologies, and new research trends.
Findings
Analysis of two-stage and end-to-end detection methods
Discussion of zero-shot and weakly supervised learning in HOI detection
Exploration of large-scale language models in HOI detection
Abstract
Human-object interaction (HOI) detection plays a key role in high-level visual understanding, facilitating a deep comprehension of human activities. Specifically, HOI detection aims to locate the humans and objects involved in interactions within images or videos and classify the specific interactions between them. The success of this task is influenced by several key factors, including the accurate localization of human and object instances, as well as the correct classification of object categories and interaction relationships. This paper systematically summarizes and discusses the recent work in image-based HOI detection. First, the mainstream datasets involved in HOI relationship detection are introduced. Furthermore, starting with two-stage methods and end-to-end one-stage detection approaches, this paper comprehensively discusses the current developments in image-based HOI…
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications
