Perspective Plane Program Induction from a Single Image
Yikai Li, Jiayuan Mao, Xiuming Zhang, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu

TL;DR
This paper introduces P3I, a neuro-symbolic approach that infers holistic scene representations from a single image, enabling improved perspective correction and image manipulation.
Contribution
The paper proposes P3I, a framework that combines search-based and gradient-based methods to jointly infer scene structure and camera pose from a single image.
Findings
P3I outperforms baselines in camera pose estimation
P3I improves global scene structure inference
P3I enhances downstream image manipulation tasks
Abstract
We study the inverse graphics problem of inferring a holistic representation for natural images. Given an input image, our goal is to induce a neuro-symbolic, program-like representation that jointly models camera poses, object locations, and global scene structures. Such high-level, holistic scene representations further facilitate low-level image manipulation tasks such as inpainting. We formulate this problem as jointly finding the camera pose and scene structure that best describe the input image. The benefits of such joint inference are two-fold: scene regularity serves as a new cue for perspective correction, and in turn, correct perspective correction leads to a simplified scene structure, similar to how the correct shape leads to the most regular texture in shape from texture. Our proposed framework, Perspective Plane Program Induction (P3I), combines search-based and…
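To make the joint-inference idea concrete, here is a minimal toy sketch of how scene regularity can act as a cue for perspective correction. It is not the authors' implementation: the 1D setup (equally spaced points on a tilted line seen through a pinhole camera), the irregularity score, and the names `project`, `irregularity`, and `estimate_tilt` are all assumptions made for illustration. The sketch only mirrors the general pattern of a coarse search followed by gradient refinement; P3I's actual program space and objective are not described in this summary.

```python
import math

def project(ts, theta, depth, focal=1.0):
    """Pinhole projection of points at distance t along a line tilted by theta (toy model)."""
    return [focal * t * math.cos(theta) / (depth + t * math.sin(theta)) for t in ts]

def irregularity(us, theta, focal=1.0):
    """Unproject image points under a candidate tilt and score how uneven the spacing is.

    The score is the variance of consecutive gaps divided by the squared mean gap,
    so it does not depend on the unknown depth and is zero for equal spacing.
    """
    c, s = math.cos(theta), math.sin(theta)
    denoms = [focal * c - u * s for u in us]
    if min(denoms) <= 0:
        return math.inf  # candidate tilt puts a point at or behind the camera
    ts = [u / d for u, d in zip(us, denoms)]
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    mean = sum(gaps) / len(gaps)
    var = sum((g - mean) ** 2 for g in gaps) / len(gaps)
    return var / mean ** 2

def estimate_tilt(us, steps=200, lr=0.3, h=1e-5):
    """Coarse grid search over the tilt, then finite-difference gradient descent."""
    coarse = [-1.0 + 0.1 * i for i in range(21)]
    theta = min(coarse, key=lambda th: irregularity(us, th))
    for _ in range(steps):
        grad = (irregularity(us, theta + h) - irregularity(us, theta - h)) / (2 * h)
        theta -= lr * grad
    return theta

# Eight equally spaced points on a line tilted by 0.47 rad, seen from depth 5.
observed = project([float(k) for k in range(8)], 0.47, depth=5.0)
print(estimate_tilt(observed))
```

In this toy, the tilt that makes the unprojected points most evenly spaced should land close to the 0.47 rad used to generate the data, which is the sense in which regular structure constrains the camera pose and a correct pose simplifies the structure.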
Videos
Perspective Plane Program Induction From a Single Image · YouTube
Taxonomy
Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Advanced Image and Video Retrieval Techniques
