DTF-Net: Category-Level Pose Estimation and Shape Reconstruction via Deformable Template Field
Haowen Wang, Zhipeng Fan, Zhen Zhao, Zhengping Che, Zhiyuan Xu, Dong Liu, Feifei Feng, Yakun Huang, Xiuquan Qiao, Jian Tang

TL;DR
The paper introduces DTF-Net, a novel neural framework that models category-level object shapes and poses using deformable templates and implicit fields, improving accuracy in open-world scene reconstruction and pose estimation.
Contribution
It proposes a deformable template field for shape representation and a pose regression module, enabling robust category-level pose estimation and shape reconstruction from RGB-D images.
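As a rough illustration of the general idea, the sketch below shows one way a shared category template, a per-instance deformation, and a 6D pose can be composed into a single implicit field. It is not the paper's network: the sphere template, the per-axis scale used as a "latent code", and the function names `template_sdf`, `deform`, and `instance_field` are all hypothetical stand-ins chosen to keep the example small. A learned model would presumably replace them with neural networks.

```python
import numpy as np

def template_sdf(x):
    """Shared category template (assumption: a sphere of radius 0.5).
    Returns a signed value: negative inside, zero on the surface, positive outside."""
    return np.linalg.norm(x, axis=-1) - 0.5

def deform(x, latent):
    """Hypothetical deformation field: maps instance-space points into template space.
    Here the latent code is just a per-axis scale; this is an illustrative stand-in."""
    return x / latent

def instance_field(points_cam, R, t, latent):
    """Evaluate the instance's implicit field at camera-frame points.
    R (3x3 rotation) and t (3,) describe the object pose in the camera frame."""
    # Row-vector form of R^T (p - t): camera frame -> object frame.
    x_obj = (points_cam - t) @ R
    # Query the shared template at the deformed location.
    return template_sdf(deform(x_obj, latent))

# Example pose: 90 degree rotation about z, plus a translation.
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
t = np.array([1.0, 2.0, 3.0])
latent = np.array([2.0, 1.0, 1.0])  # stretches the template along the object's x-axis

surface_obj = np.array([1.0, 0.0, 0.0])   # lies on the stretched shape in the object frame
surface_cam = R @ surface_obj + t         # same point expressed in the camera frame
values = instance_field(np.stack([surface_cam, t]), R, t, latent)
```

With a non-uniform scale the field value is no longer a true Euclidean distance, but its sign and zero level set still describe the deformed shape, which is all this sketch relies on.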
Findings
Outperforms existing methods on REAL275 and CAMERA25 datasets.
Effectively supports robotic grasping tasks.
Handles intra-category shape variations with high accuracy.
Abstract
Estimating 6D poses and reconstructing 3D shapes of objects in open-world scenes from RGB-depth image pairs is challenging. Many existing methods rely on learning geometric features that correspond to specific templates while disregarding shape variations and pose differences among objects in the same category. As a result, these methods underperform when handling unseen object instances in complex environments. In contrast, other approaches aim to achieve category-level estimation and reconstruction by leveraging normalized geometric structure priors, but the static prior-based reconstruction struggles with substantial intra-class variations. To solve these problems, we propose the DTF-Net, a novel framework for pose estimation and shape reconstruction based on implicit neural fields of object categories. In DTF-Net, we design a deformable template field to represent the general…
