PlaneRecTR++: Unified Query Learning for Joint 3D Planar Reconstruction and Pose Estimation
Jingjia Shi, Shuaifeng Zhi, Kai Xu

TL;DR
PlaneRecTR++ is a unified Transformer-based framework that jointly performs multi-view 3D planar reconstruction and relative camera pose estimation from images. It removes the need for separate per-task modules and for an initial pose estimate, and reports state-of-the-art results.
Contribution
It is presented as the first work to unify all sub-tasks of multi-view planar reconstruction and pose estimation into a single-stage, query-based Transformer framework, trained without external plane correspondence supervision.
Findings
Achieves new state-of-the-art performance on multiple datasets.
Eliminates the need for initial pose estimation and external plane correspondence labels.
Demonstrates mutual benefits across sub-tasks through unified learning.
Abstract
The challenging task of 3D planar reconstruction from images involves several sub-tasks, including frame-wise plane detection, segmentation, parameter regression and possibly depth prediction, along with cross-frame plane correspondence and relative camera pose estimation. Previous works adopt a divide-and-conquer strategy, addressing the above sub-tasks with distinct network modules in a two-stage paradigm. Specifically, given an initial camera pose and per-frame plane predictions from the first stage, further exclusively designed modules relying on external plane correspondence labeling are applied to merge multi-view plane entities and produce a refined camera pose. Notably, existing work fails to integrate these closely related sub-tasks into a unified framework, and instead addresses them separately and sequentially, which we identify as a primary source of performance limitations.…
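The abstract ties cross-frame plane correspondence to relative camera pose. As a minimal geometric sketch of why the two are coupled (not taken from the paper), the snippet below maps a plane from one camera frame to another. It assumes the plane is written as n·x = d with unit normal n and that the relative pose acts as x2 = R x1 + t; the paper may use a different parameterization, and the helper names `transform_plane` and `transform_point` are hypothetical.

```python
import math

def mat_vec(R, v):
    # Multiply a 3x3 matrix (list of rows) by a 3-vector.
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def transform_plane(n, d, R, t):
    # Assumed convention: plane n.x = d in camera 1, pose x2 = R x1 + t.
    # Substituting x1 = R^T (x2 - t) gives (R n).x2 = d + (R n).t.
    n2 = mat_vec(R, n)
    d2 = d + dot(n2, t)
    return n2, d2

def transform_point(x, R, t):
    # Apply the same relative pose to a single 3D point.
    return [a + b for a, b in zip(mat_vec(R, x), t)]

# Illustrative pose: 90-degree rotation about the z axis plus a shift along x.
c, s = math.cos(math.pi / 2), math.sin(math.pi / 2)
R = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
t = [1.0, 0.0, 0.0]

# Plane x = 2 in camera 1, and a point lying on it.
n1, d1 = [1.0, 0.0, 0.0], 2.0
x1 = [2.0, 5.0, -3.0]

n2, d2 = transform_plane(n1, d1, R, t)
x2 = transform_point(x1, R, t)

# The transformed point should satisfy the transformed plane equation.
residual = dot(n2, x2) - d2
print(abs(residual) < 1e-9)
```

Under these assumptions, two per-frame plane predictions describe the same physical plane only if they agree after such a transform, which is one way to see why pose and correspondence errors feed into each other when the sub-tasks are solved separately.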
Taxonomy
Topics: Advanced Vision and Imaging · Medical Image Segmentation Techniques · Robotics and Sensor-Based Localization
Methods: None
