PPDM: Parallel Point Detection and Matching for Real-time Human-Object Interaction Detection
Yue Liao, Si Liu, Fei Wang, Yanjie Chen, Chen Qian, Jiashi Feng

TL;DR
This paper introduces PPDM, a single-stage HOI detection method that detects points and matches them in parallel. It outperforms existing methods on HICO-DET while running in real time at 37 fps on a single Titan XP GPU.
Contribution
The paper presents a parallel architecture for HOI detection that improves efficiency and accuracy by detecting points and matching them in a single stage, and it introduces a new HOI dataset, HOI-A.
Findings
Runs at 37 fps on a single Titan XP GPU while outperforming all existing methods on the HICO-DET dataset.
First real-time HOI detection method.
Provides a new HOI dataset, HOI-A.
Abstract
We propose a single-stage Human-Object Interaction (HOI) detection method that outperforms all existing methods on the HICO-DET dataset at 37 fps on a single Titan XP GPU. It is the first real-time HOI detection method. Conventional HOI detection methods are composed of two stages, i.e., human-object proposal generation and proposal classification. Their effectiveness and efficiency are limited by this sequential and separate architecture. In this paper, we propose a Parallel Point Detection and Matching (PPDM) HOI detection framework. In PPDM, an HOI is defined as a point triplet <human point, interaction point, object point>. Human and object points are the centers of the detection boxes, and the interaction point is the midpoint of the human and object points. PPDM contains two parallel branches, namely a point detection branch and a point matching branch. The point detection branch…
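The point triplet defined in the abstract can be illustrated with a minimal sketch. It only shows the geometry (box centers and their midpoint), not the detection or matching networks. The `(x1, y1, x2, y2)` box format and the names `PointTriplet`, `box_center`, and `make_triplet` are assumptions made for illustration and do not come from the paper or its code.

```python
from typing import NamedTuple, Tuple

Point = Tuple[float, float]
# Assumed box format (not specified in the abstract): (x1, y1, x2, y2) corners.
Box = Tuple[float, float, float, float]


class PointTriplet(NamedTuple):
    """Hypothetical container for <human point, interaction point, object point>."""
    human: Point
    interaction: Point
    obj: Point


def box_center(box: Box) -> Point:
    """Return the center of a detection box given as corner coordinates."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def make_triplet(human_box: Box, object_box: Box) -> PointTriplet:
    """Build the point triplet: the two box centers and their midpoint."""
    human = box_center(human_box)
    obj = box_center(object_box)
    interaction = ((human[0] + obj[0]) / 2.0, (human[1] + obj[1]) / 2.0)
    return PointTriplet(human, interaction, obj)


if __name__ == "__main__":
    # Example boxes are made up for illustration.
    triplet = make_triplet((10, 20, 50, 120), (60, 80, 100, 100))
    print(triplet)
```

With the example boxes, the human point is (30, 70), the object point is (80, 90), and the interaction point is their midpoint, (55, 80).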
Taxonomy
Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Human Pose and Action Recognition
