Towards Improving the Generation Quality of Autoregressive Slot VAEs

Patrick Emami; Pan He; Sanjay Ranka; Anand Rangarajan

arXiv:2206.01370·cs.CV·November 29, 2023

TL;DR

This paper enhances autoregressive slot VAEs by conditioning on scene-level variables and learning a consistent object order, significantly improving unconditional scene generation quality.

Contribution

It introduces two improvements, scene-level conditioning and a learned object order, to better model correlations between objects in generated scenes.

Findings

01. Improved unconditional scene generation quality across three environments.

02. Validated the effectiveness of scene-level conditioning and object ordering through ablation studies.

03. Achieved significant gains over baseline models in generating coherent multi-object scenes.

Abstract

Unconditional scene inference and generation are challenging to learn jointly with a single compositional model. Despite encouraging progress on models that extract object-centric representations ("slots") from images, unconditional generation of scenes from slots has received less attention. This is primarily because learning the multi-object relations necessary to imagine coherent scenes is difficult. We hypothesize that most existing slot-based models have a limited ability to learn object correlations. We propose two improvements that strengthen object correlation learning. The first is to condition the slots on a global, scene-level variable that captures higher-order correlations between slots. Second, we address the fundamental lack of a canonical order for objects in images by proposing to learn a consistent order to use for the autoregressive generation of scene objects.…
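The generation scheme described in the abstract can be illustrated with a toy ancestral sampler: draw one scene-level variable, then draw slots one at a time, each conditioned on the scene variable and on the slots already drawn. This is a minimal sketch under assumptions, not the paper's implementation. The function names (`sample_scene`, `random_linear`, `linear`), the dimensions, the untrained random weights, and the running mean used to summarize earlier slots are all hypothetical stand-ins. The sketch also assumes the slots already follow a fixed order and does not attempt to learn one.

```python
import math
import random


def linear(weights, bias, x):
    """Affine map; `weights` is a list of rows and `x` a list of floats."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def random_linear(rng, out_dim, in_dim, scale=0.1):
    """Hypothetical untrained parameters standing in for a learned network."""
    weights = [[rng.gauss(0.0, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def sample_scene(num_slots=4, slot_dim=3, scene_dim=2, seed=0):
    """Ancestral sampling: z_scene first, then slots in a fixed order."""
    rng = random.Random(seed)

    # The conditional prior sees the scene variable plus a summary of
    # the previously sampled slots (assumption: a simple running mean).
    in_dim = scene_dim + slot_dim
    w_mu, b_mu = random_linear(rng, slot_dim, in_dim)
    w_log_std, b_log_std = random_linear(rng, slot_dim, in_dim)

    # Step 1: global scene-level variable z_scene ~ N(0, I).
    z_scene = [rng.gauss(0.0, 1.0) for _ in range(scene_dim)]

    # Step 2: slots s_1..s_K, each drawn from p(s_k | s_<k, z_scene).
    slots = []
    summary = [0.0] * slot_dim
    for k in range(num_slots):
        context = z_scene + summary
        mu = linear(w_mu, b_mu, context)
        log_std = linear(w_log_std, b_log_std, context)
        slot = [m + math.exp(ls) * rng.gauss(0.0, 1.0)
                for m, ls in zip(mu, log_std)]
        slots.append(slot)
        # Update the running mean over the k + 1 slots sampled so far.
        summary = [(s * k + v) / (k + 1) for s, v in zip(summary, slot)]

    return z_scene, slots


if __name__ == "__main__":
    z_scene, slots = sample_scene()
    print("scene variable:", z_scene)
    for index, slot in enumerate(slots, start=1):
        print(f"slot {index}:", slot)
```

Because every slot shares the same `z_scene` and also depends on its predecessors, the sampled slots are correlated rather than independent, which is the property the abstract argues is needed for coherent scenes. In a trained model the random weights would be replaced by learned networks, and each slot would be passed to a decoder to render an object.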

Code & Models

Repositories

pemami4911/segregate-relate-imagine (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis

Methods: ALIGN · Variational Inference