R-CNNs for Pose Estimation and Action Detection
Georgia Gkioxari, Bharath Hariharan, Ross Girshick, Jitendra Malik

TL;DR
This paper introduces R-CNN-based methods for pose estimation and action detection in images, achieving state-of-the-art results and providing a new dataset for localizing people and classifying their actions.
Contribution
The paper develops R-CNN models tailored for keypoint and action prediction, and introduces a new dataset for action detection, the task of simultaneously localizing people and classifying their actions.
Findings
State-of-the-art results on PASCAL VOC for pose and action prediction
Effective R-CNN training with task-specific loss functions
New dataset for action detection with promising baseline results
Abstract
We present convolutional neural networks for the tasks of keypoint (pose) prediction and action classification of people in unconstrained images. Our approach involves training an R-CNN detector with loss functions depending on the task being tackled. We evaluate our method on the challenging PASCAL VOC dataset and compare it to previous leading approaches. Our method gives state-of-the-art results for keypoint and action prediction. Additionally, we introduce a new dataset for action detection, the task of simultaneously localizing people and classifying their actions, and present results using our approach.
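To make "simultaneously localizing people and classifying their actions" concrete, here is a minimal sketch of how a single action detection might be scored. It is not taken from the paper: the names `Box`, `iou`, and `is_correct_detection` are hypothetical, and the overlap threshold of 0.5 is borrowed from the standard PASCAL VOC detection criterion as an assumption about how such a task is typically evaluated.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned box given by its top-left and bottom-right corners."""
    x1: float
    y1: float
    x2: float
    y2: float


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes; 0.0 if they do not overlap."""
    inter_w = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    inter_h = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = inter_w * inter_h
    area_a = (a.x2 - a.x1) * (a.y2 - a.y1)
    area_b = (b.x2 - b.x1) * (b.y2 - b.y1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_correct_detection(pred_box: Box, pred_action: str,
                         gt_box: Box, gt_action: str,
                         iou_threshold: float = 0.5) -> bool:
    """A prediction counts only if BOTH the box and the action label are right.

    Assumption: overlap must strictly exceed the threshold, as in the
    PASCAL VOC detection criterion; the paper's exact protocol may differ.
    """
    return pred_action == gt_action and iou(pred_box, gt_box) > iou_threshold
```

Under this sketch, a well-placed box with the wrong action label is a miss, and so is the right label on a poorly placed box, which is what distinguishes action detection from action classification on pre-cropped people.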
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Video Surveillance and Tracking Methods
Methods: Support Vector Machine · Max Pooling · Convolution · R-CNN
