PlaneRecTR++: Unified Query Learning for Joint 3D Planar Reconstruction and Pose Estimation

Jingjia Shi; Shuaifeng Zhi; Kai Xu

arXiv:2307.13756 · cs.CV · September 18, 2025

TL;DR

PlaneRecTR++ introduces a unified Transformer-based framework that simultaneously performs 3D planar reconstruction and pose estimation from images, eliminating the need for separate modules and initial pose estimation, leading to state-of-the-art results.

Contribution

PlaneRecTR++ is the first to unify all sub-tasks of multi-view planar reconstruction and pose estimation in a single-stage, query-based Transformer framework, without relying on external plane correspondence supervision.

Findings

01. Achieves new state-of-the-art performance on multiple datasets.

02. Eliminates the need for initial pose estimation and external plane correspondence labels.

03. Demonstrates mutual benefits across sub-tasks through unified learning.

Abstract

The challenging task of 3D planar reconstruction from images involves several sub-tasks, including frame-wise plane detection, segmentation, parameter regression and possibly depth prediction, along with cross-frame plane correspondence and relative camera pose estimation. Previous works adopt a divide-and-conquer strategy, addressing the above sub-tasks with distinct network modules in a two-stage paradigm. Specifically, given an initial camera pose and per-frame plane predictions from the first stage, additional, specially designed modules that rely on external plane correspondence labeling are applied to merge multi-view plane entities and produce a refined camera pose. Notably, existing work fails to integrate these closely related sub-tasks into a unified framework, and instead addresses them separately and sequentially, which we identify as a primary source of performance limitations.…
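To make the link between cross-frame plane correspondence and relative camera pose concrete, here is a minimal geometric sketch. It is not the paper's method or code: the function names (`transform_plane`, `plane_distance`), the plane convention n·x = d, and the pose convention x2 = R·x1 + t are all assumptions chosen for illustration. It only shows that, once a relative pose is known, a plane predicted in one frame can be expressed in the other frame and compared with that frame's planes.

```python
import math


def rotation_z(theta):
    """Rotation matrix for an angle theta (radians) about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def transform_plane(normal, offset, rotation, translation):
    """Express the plane n.x = d (camera-1 frame) in the camera-2 frame.

    Assumed pose convention: a point x1 in camera 1 maps to x2 = R x1 + t.
    Substituting x1 = R^T (x2 - t) gives n2 = R n and d2 = d + n2.t.
    """
    n2 = mat_vec(rotation, normal)
    d2 = offset + dot(n2, translation)
    return n2, d2


def plane_distance(plane_a, plane_b):
    """Distance between two planes in the compact n*d parameterization."""
    (na, da), (nb, db) = plane_a, plane_b
    return math.dist([c * da for c in na], [c * db for c in nb])


if __name__ == "__main__":
    # Hypothetical example: the plane z = 2 seen from camera 1.
    R = rotation_z(math.pi / 2)
    t = [0.5, 0.0, 1.0]
    n2, d2 = transform_plane([0.0, 0.0, 1.0], 2.0, R, t)
    print("plane in camera 2:", n2, d2)
    # A plane detected in camera 2 matches if its distance is small.
    print("distance to candidate:", plane_distance((n2, d2), ([0.0, 0.0, 1.0], 3.0)))
```

In a two-stage pipeline of the kind the abstract describes, a distance like this could be one cue for merging plane entities across views; the unified approach instead learns these sub-tasks jointly.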

Code & Models

Repositories

sjingjia/planerectr (PyTorch, official)

Taxonomy

Topics: Advanced Vision and Imaging · Medical Image Segmentation Techniques · Robotics and Sensor-Based Localization

Methods: None