Generating Annotated High-Fidelity Images Containing Multiple Coherent Objects

Bryan G. Cardenas; Devanshu Arya; Deepak K. Gupta

arXiv:2006.12150 · cs.CV · July 19, 2021

TL;DR

This paper introduces a multi-object image generation framework that produces high-fidelity, coherent images together with object annotations, without requiring explicit contextual information such as semantic layouts or textual descriptions. This is useful in domains such as medical imaging, where that information is hard to provide.

Contribution

The proposed method combines a VQ-VAE with two autoregressive priors, PixelSNAIL and LayoutPixelSNAIL, to generate multi-object images that preserve spatial and semantic coherence without auxiliary input.

Findings

01 Outperforms state-of-the-art multi-object generative methods on Multi-MNIST and CLEVR datasets.

02 Generated images maintain high fidelity and object coherence.

03 Augmenting training data with generated images improves model performance in medical imaging.

Abstract

Recent developments related to generative models have made it possible to generate diverse high-fidelity images. In particular, layout-to-image generation models have gained significant attention due to their capability to generate realistic complex images containing distinct objects. These models are generally conditioned on either semantic layouts or textual descriptions. However, unlike natural images, providing auxiliary information can be extremely hard in domains such as biomedical imaging and remote sensing. In this work, we propose a multi-object generation framework that can synthesize images with multiple objects without explicitly requiring their contextual information during the generation process. Based on a vector-quantized variational autoencoder (VQ-VAE) backbone, our model learns to preserve spatial coherency within an image as well as semantic coherency between the…

Code & Models

Repositories

Cynetics/MSGNet (PyTorch, official)

Taxonomy

Methods: VQ-VAE