Category Level Object Pose Estimation via Neural Analysis-by-Synthesis
Xu Chen, Zijian Dong, Jie Song, Andreas Geiger, Otmar Hilliges

TL;DR
This paper introduces a neural analysis-by-synthesis approach for category-level object pose estimation that jointly optimizes pose, shape, and appearance using a neural image synthesis module, eliminating the need for explicit CAD models.
Contribution
It presents a neural network-based method that implicitly models object categories for pose estimation, enabling joint optimization without explicit object models.
Findings
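Recovers object orientation with high accuracy from 2D images alone.
Recovers the full 6DOF pose when depth measurements are available.
Uses a synthesis network that spans the pose space efficiently, leaving model capacity for joint shape and appearance variation.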
Abstract
Many object pose estimation algorithms rely on the analysis-by-synthesis framework, which requires explicit representations of individual object instances. In this paper, we combine a gradient-based fitting procedure with a parametric neural image synthesis module that is capable of implicitly representing the appearance, shape, and pose of entire object categories, thus removing the need for explicit CAD models per object instance. The image synthesis network is designed to efficiently span the pose configuration space so that model capacity can be used to capture the shape and local appearance (i.e., texture) variations jointly. At inference time, the synthesized images are compared to the target via an appearance-based loss, and the error signal is backpropagated through the network to the input parameters. Keeping the network parameters fixed, this allows for iterative…
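The inference loop described in the abstract (a fixed synthesis module, an image-space loss, and gradient updates applied to the input parameters only) can be illustrated with a minimal toy sketch. Everything in it is an assumption made for illustration and not the paper's method: `synthesize` is a hypothetical stand-in for the trained network that draws a Gaussian blob, `theta` plays the role of pose, `a` plays the role of an appearance code, the loss is a plain mean squared error, and gradients are approximated by central finite differences instead of backpropagation.

```python
import math

# Hypothetical stand-in for the trained image synthesis network: a fixed,
# differentiable function that "renders" a Gaussian blob on an 11 x 11 grid.
# Its internals never change during fitting, mirroring fixed network weights.
GRID = [-2.0 + 0.4 * i for i in range(11)]
SIGMA = 0.6


def synthesize(theta, a):
    """Render a flat image: blob of brightness `a` centred at angle `theta`."""
    cx, cy = math.cos(theta), math.sin(theta)
    return [
        a * math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * SIGMA ** 2))
        for y in GRID
        for x in GRID
    ]


def loss(params, target):
    """Mean squared error between the synthesized image and the target."""
    image = synthesize(*params)
    return sum((p - t) ** 2 for p, t in zip(image, target)) / len(target)


def numerical_grad(params, target, eps=1e-5):
    """Central finite differences w.r.t. the inputs (stand-in for backprop)."""
    grad = []
    for i in range(len(params)):
        hi = list(params)
        lo = list(params)
        hi[i] += eps
        lo[i] -= eps
        grad.append((loss(hi, target) - loss(lo, target)) / (2 * eps))
    return grad


def fit(target, init, lr=2.0, steps=300):
    """Gradient descent on the input parameters; the generator stays fixed."""
    params = list(init)
    for _ in range(steps):
        g = numerical_grad(params, target)
        params = [p - lr * gi for p, gi in zip(params, g)]
    return params


# Target image produced with "unknown" pose 1.0 rad and appearance 1.2.
target = synthesize(1.0, 1.2)
theta_hat, a_hat = fit(target, init=(0.3, 0.5))
print(f"estimated pose: {theta_hat:.4f}, estimated appearance: {a_hat:.4f}")
```

In this toy setting the initial guess is close enough that plain gradient descent reaches the target parameters. With a real network one would use automatic differentiation in place of `numerical_grad`, and the result would generally depend on initialization.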
