Compositional Scene Understanding through Inverse Generative Modeling
Yanbo Wang, Justin Dauwels, Yilun Du

TL;DR
This paper introduces a compositional inverse generative modeling approach for scene understanding, enabling robust inference of scene structure and global factors from images, even with novel objects and configurations.
Contribution
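The paper proposes a compositional inverse generative modeling framework: the scene-level generative model is built from smaller models over pieces of a scene, which lets inference generalize to unseen objects and scene configurations. The same procedure can be applied to pretrained generative models.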
Findings
Enables inference of objects and scene factors from natural images.
Generalizes to scenes with new objects and configurations.
Applicable to pretrained text-to-image models for zero-shot perception.
Abstract
Generative models have demonstrated remarkable abilities in generating high-fidelity visual content. In this work, we explore how generative models can further be used not only to synthesize visual content but also to understand the properties of a scene given a natural image. We formulate scene understanding as an inverse generative modeling problem, where we seek to find conditional parameters of a visual generative model to best fit a given natural image. To enable this procedure to infer scene structure from images substantially different than those seen during training, we further propose to build this visual generative model compositionally from smaller models over pieces of a scene. We illustrate how this procedure enables us to infer the set of objects in a scene, enabling robust generalization to new test scenes with an increased number of objects of new shapes. We further…
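The inverse-modeling idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the per-object "generator" below is a hypothetical fixed-template renderer, composition is a plain sum of component images, and inference is an exhaustive search over concept sets scored by squared reconstruction error. All names (`render_component`, `render_scene`, `infer_concepts`) and the 8x8 grid are assumptions made for illustration only.

```python
import itertools
import numpy as np

GRID = 8  # side length of the toy image (assumption for this sketch)

def render_component(concept):
    """Hypothetical per-object generator: draws one 2x2 blob at (row, col)."""
    row, col = concept
    img = np.zeros((GRID, GRID))
    img[row:row + 2, col:col + 2] = 1.0
    return img

def render_scene(concepts):
    """Compose the scene model from per-object models by summing their outputs."""
    img = np.zeros((GRID, GRID))
    for c in concepts:
        img += render_component(c)
    return img

def infer_concepts(image, candidates, max_objects):
    """Inverse modeling by exhaustive search: pick the concept set whose
    composed render has the lowest squared error against the image."""
    # Start from the empty scene, whose error is the image energy.
    best_set, best_err = (), float(np.sum(image ** 2))
    for k in range(1, max_objects + 1):
        for subset in itertools.combinations(candidates, k):
            err = float(np.sum((render_scene(subset) - image) ** 2))
            if err < best_err:
                best_set, best_err = subset, err
    return set(best_set), best_err

# Toy "observation": three non-overlapping objects plus mild pixel noise.
candidates = [(0, 0), (0, 4), (4, 0), (4, 4), (2, 2)]
true_concepts = {(0, 0), (4, 4), (2, 2)}
rng = np.random.default_rng(0)
observed = render_scene(true_concepts) + 0.05 * rng.standard_normal((GRID, GRID))

inferred, error = infer_concepts(observed, candidates, max_objects=4)
print(inferred, round(error, 3))
```

Because the scene model is a composition of per-object models, the search can consider any number of objects up to `max_objects`, including counts that a monolithic model would not have seen together. A practical system would replace the template renderer with learned generative models and the exhaustive search with a scalable fitting procedure.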
Taxonomy
Topics: Image Processing and 3D Reconstruction · 3D Surveying and Cultural Heritage · Geological Modeling and Analysis
Methods: Sparse Evolutionary Training
