Human-Object Interaction Detection: A Quick Survey and Examination of Methods
Trevor Bergstrom, Humphrey Shi

TL;DR
This paper surveys the state-of-the-art in human-object interaction detection, analyzes the performance of key components in multi-stream CNN architectures, and provides insights into the HICO-DET benchmark dataset.
Contribution
To the authors' knowledge, this is the first general survey of human-object interaction detection methods, paired with a component-wise performance analysis focused on the HORCNN architecture.
Findings
Multi-stream CNNs are commonly used for human-object interaction detection.
Component analysis reveals the contribution of each input source to detection performance.
The HICO-DET dataset is a key benchmark for evaluating these methods.
Abstract
Human-object interaction detection is a relatively new task in the world of computer vision and visual semantic information extraction. With the goal of machines identifying interactions that humans perform on objects, there are many real-world use cases for the research in this field. To our knowledge, this is the first general survey of the state-of-the-art and milestone works in this field. We provide a basic survey of the developments in the field of human-object interaction detection. Many works in this field use multi-stream convolutional neural network architectures, which combine features from multiple sources in the input image. Most commonly these are the humans and objects in question, as well as the spatial quality of the two. As far as we are aware, there have not been in-depth studies performed that look into the performance of each component individually. In order to…
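The multi-stream idea described in the abstract can be illustrated with a minimal sketch. It assumes each stream (human, object, spatial) emits one logit per interaction class and that streams are fused by summing logits before a sigmoid. That fusion rule, the `fuse_scores` helper, and the example logits are assumptions made for illustration; this page does not specify them. Dropping a stream from `active` mimics the kind of component-wise comparison the paper describes.

```python
import math

# Stream names follow the abstract: human, object, and their spatial relation.
STREAMS = ("human", "object", "spatial")


def sigmoid(x):
    """Map a logit to a confidence in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fuse_scores(stream_logits, active=STREAMS):
    """Sum per-class logits over the active streams, then apply a sigmoid.

    Summation is an assumed late-fusion rule, not one confirmed by the source.
    """
    if not active:
        raise ValueError("at least one stream must be active")
    n_classes = len(stream_logits[active[0]])
    fused = [sum(stream_logits[s][c] for s in active) for c in range(n_classes)]
    return [sigmoid(v) for v in fused]


# Hypothetical logits for one human-object pair over three interaction classes.
example_logits = {
    "human":   [1.0, -0.5, 0.0],
    "object":  [0.5, -1.0, 0.0],
    "spatial": [0.5,  0.5, 0.0],
}

full = fuse_scores(example_logits)
no_spatial = fuse_scores(example_logits, active=("human", "object"))

for name, scores in (("all streams", full), ("without spatial", no_spatial)):
    print(name, [round(s, 3) for s in scores])
```

Comparing `full` with `no_spatial` shows how removing one input source shifts the per-class confidences, which is the effect a component analysis would measure on a real benchmark.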
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Human Pose and Action Recognition
