MobilePose: Real-Time Pose Estimation for Unseen Objects with Weak Shape Supervision
Tingbo Hou, Adel Ahmadyan, Liangkai Zhang, Jianing Wei, and Matthias Grundmann

TL;DR
This paper introduces MobilePose, a lightweight real-time 3D pose estimation method for unseen objects that leverages weak shape supervision, achieving high accuracy on mobile devices.
Contribution
It proposes two mobile-friendly networks, MobilePose-Base and MobilePose-Shape, incorporating shape features and weak supervision to improve pose estimation.
Findings
Achieves 36 FPS on Galaxy S20
Outperforms previous single-shot methods in accuracy
Uses a model only 2-3% the size of prior single-shot models (by model size or parameter count)
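To put the reported frame rate in context, the short sketch below converts frames per second into an average per-frame time budget. The helper name `frame_budget_ms` is hypothetical, and the calculation assumes the 36 FPS figure is an average end-to-end throughput, which this summary does not specify.

```python
def frame_budget_ms(fps: float) -> float:
    """Return the average time available per frame, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# 36 FPS is the Galaxy S20 figure quoted in the findings above.
budget = frame_budget_ms(36.0)
print(f"{budget:.1f} ms per frame")  # prints "27.8 ms per frame"
```

Under that assumption, everything done per frame has to fit in roughly 28 ms on average.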
Abstract
In this paper, we address the problem of detecting unseen objects from RGB images and estimating their poses in 3D. We propose two mobile-friendly networks: MobilePose-Base and MobilePose-Shape. The former is used when there is only pose supervision, and the latter is for the case when shape supervision is available, even a weak one. We revisit shape features used in previous methods, including segmentation and coordinate maps. We explain when and why pixel-level shape supervision can improve pose estimation. Consequently, we add shape prediction as an intermediate layer in MobilePose-Shape, and let the network learn pose from shape. Our models are trained on mixed real and synthetic data, with weak and noisy shape supervision. They are ultra-lightweight and can run in real time on modern mobile devices (e.g., 36 FPS on Galaxy S20). Compared with previous single-shot solutions, our…
Taxonomy
Topics: Robotics and Sensor-Based Localization · Robot Manipulation and Learning · Advanced Vision and Imaging
