PlaneFormers: From Sparse View Planes to 3D Reconstruction

Samir Agarwala; Linyi Jin; Chris Rockwell; David F. Fouhey

arXiv:2208.04307 · cs.CV · August 9, 2022

TL;DR

PlaneFormers introduces a transformer-based method for reconstructing 3D planar surfaces from limited-overlap images, effectively integrating 3D reasoning, correspondence, and camera pose estimation in a unified framework.

Contribution

The paper presents the PlaneFormer, a transformer applied to 3D-aware plane tokens, as a simpler alternative to prior optimization-based approaches for reconstructing planar surfaces from sparse-view images.

Findings

01. Outperforms prior optimization-based methods
02. Effective in scenes with limited image overlap
03. Highlights importance of 3D-specific design choices

Abstract

We present an approach for the planar surface reconstruction of a scene from images with limited overlap. This reconstruction task is challenging since it requires jointly reasoning about single image 3D reconstruction, correspondence between images, and the relative camera pose between images. Past work has proposed optimization-based approaches. We introduce a simpler approach, the PlaneFormer, that uses a transformer applied to 3D-aware plane tokens to perform 3D reasoning. Our experiments show that our approach is substantially more effective than prior work, and that several 3D-specific design decisions are crucial for its success.
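The link the abstract draws between plane correspondence and relative camera pose can be illustrated with a small geometric sketch. This is not the PlaneFormer model or the authors' code. It only shows, under assumed conventions, why the two problems are coupled: if the relative pose is right, a plane seen in one view should land on its match once expressed in the other view's frame. The plane parameterization (unit normal `n` and offset `d` with `n . x = d`), the pose convention (`x2 = R x1 + t`), and all function names and numbers below are assumptions made for illustration.

```python
import math


def mat_vec(R, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def transform_plane(normal, offset, R, t):
    """Express the plane {x : normal . x = offset} in view-2 coordinates.

    Assumed convention: a point x1 in view 1 maps to x2 = R x1 + t.
    Substituting x1 = R^T (x2 - t) gives (R n) . x2 = d + (R n) . t.
    """
    n2 = mat_vec(R, normal)
    d2 = offset + dot(n2, t)
    return n2, d2


def plane_distance(plane_a, plane_b):
    """Return (angle between normals in degrees, absolute offset gap)."""
    (na, da), (nb, db) = plane_a, plane_b
    cos_angle = max(-1.0, min(1.0, dot(na, nb)))
    return math.degrees(math.acos(cos_angle)), abs(da - db)


def rot_y(deg):
    """Rotation about the y axis by `deg` degrees."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Hypothetical scene: one wall observed from two cameras.
R_true, t_true = rot_y(90.0), [0.5, 0.0, 0.0]
plane_view1 = ([0.0, 0.0, 1.0], 2.0)  # the wall z = 2 in view-1 coordinates
plane_view2 = ([1.0, 0.0, 0.0], 2.5)  # the same wall in view-2 coordinates

# With the correct pose the transformed plane coincides with its match.
good_angle, good_gap = plane_distance(
    transform_plane(*plane_view1, R_true, t_true), plane_view2
)

# With a wrong pose (identity rotation, zero translation) it does not.
bad_angle, bad_gap = plane_distance(
    transform_plane(*plane_view1, IDENTITY, [0.0, 0.0, 0.0]), plane_view2
)
```

In this sketch, a small angle and offset gap under a candidate pose is evidence both that the pose is plausible and that the two planes correspond, which is one way to read the abstract's claim that the subproblems must be reasoned about jointly.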

Code & Models

Repositories

samiragarwala/PlaneFormers (PyTorch, official)

Taxonomy

Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques