Click Here: Human-Localized Keypoints as Guidance for Viewpoint Estimation
Ryan Szeto, Jason J. Corso

TL;DR
This paper introduces CH-CNN, a neural network that uses the location and class of a single human-provided keypoint to improve monocular viewpoint estimation. It is trained on synthetic renderings from a new dataset of 3D keypoint annotations and shows accuracy gains on PASCAL 3D+ test instances.
Contribution
The paper presents a CNN architecture, Click-Here CNN (CH-CNN), that uses the location and class of one human-provided keypoint to guide viewpoint prediction, along with a new dataset of 3D keypoint annotations on thousands of CAD models.
Findings
Achieves 90.7% mean class accuracy on PASCAL 3D+
Outperforms the previous state-of-the-art baseline by 5 percentage points
Validates the effectiveness of human-in-the-loop guidance in viewpoint estimation
Abstract
We motivate and address a human-in-the-loop variant of the monocular viewpoint estimation task in which the location and class of one semantic object keypoint is available at test time. In order to leverage the keypoint information, we devise a Convolutional Neural Network called Click-Here CNN (CH-CNN) that integrates the keypoint information with activations from the layers that process the image. It transforms the keypoint information into a 2D map that can be used to weigh features from certain parts of the image more heavily. The weighted sum of these spatial features is combined with global image features to provide relevant information to the prediction layers. To train our network, we collect a novel dataset of 3D keypoint annotations on thousands of CAD models, and synthetically render millions of images with 2D keypoint information. On test instances from PASCAL 3D+, our model…
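The weighting scheme described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' network: the Gaussian bump around the click, the per-class bias table `class_bias`, the softmax normalization, and the concatenation in `combine` are all assumptions chosen for illustration, and every function and parameter name here is hypothetical. It only shows the general idea of turning a keypoint into a 2D map, using that map to take a weighted sum of spatial features, and joining the result with global image features.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a flat vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def keypoint_weighted_features(conv_feats, kp_rc, kp_class, class_bias, sigma=1.0):
    """Weight spatial features by a 2D map derived from one keypoint.

    conv_feats: (C, H, W) activations from an image-processing layer.
    kp_rc:      (row, col) of the clicked keypoint in feature-map coordinates.
    kp_class:   integer index of the keypoint's semantic class.
    class_bias: (K, H*W) hypothetical per-class bias over map cells
                (an assumption; a real model would learn this mapping).
    """
    C, H, W = conv_feats.shape
    rows, cols = np.mgrid[0:H, 0:W]
    # Assumed location encoding: scores fall off with squared distance
    # from the clicked cell (a Gaussian bump in log space).
    dist2 = (rows - kp_rc[0]) ** 2 + (cols - kp_rc[1]) ** 2
    logits = (-dist2 / (2.0 * sigma ** 2)).ravel() + class_bias[kp_class]
    # Normalize into a 2D weight map whose entries sum to 1.
    weights = softmax(logits).reshape(H, W)
    # Weighted sum of the spatial features: one value per channel.
    attended = (conv_feats * weights).sum(axis=(1, 2))
    return weights, attended

def combine(attended, global_feats):
    # Assumed combination: simple concatenation before the prediction layers.
    return np.concatenate([attended, global_feats])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.standard_normal((8, 13, 13))   # toy spatial features
    global_feats = rng.standard_normal(16)     # toy global image features
    bias = np.zeros((3, 13 * 13))              # 3 toy keypoint classes
    w, a = keypoint_weighted_features(feats, (4, 9), 2, bias)
    print(w.shape, a.shape, combine(a, global_feats).shape)
```

With a zero bias table, the weight map peaks at the clicked cell, so features near the keypoint dominate the weighted sum; a learned bias would let the keypoint class shift attention elsewhere in the image.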
Taxonomy
Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Advanced Vision and Imaging
