Multi-Class Multi-Instance Count Conditioned Adversarial Image Generation
Amrutha Saseendran, Kathrin Skubch, Margret Keuper

TL;DR
This paper introduces a conditional GAN that generates high-quality images with specified object counts per class, extending StyleGAN2 with count conditioning and a counting regression network, demonstrated on complex datasets including a new CityCount dataset.
Contribution
The paper presents a novel multi-class, multi-instance count-conditioned GAN that can generate images with specified object counts, advancing content control in image synthesis.
Findings
The model successfully generates images matching specified object counts.
It performs well on complex backgrounds and diverse datasets.
The paper introduces the CityCount dataset for multi-class counting evaluation.
Abstract
Image generation has rapidly evolved in recent years. Modern architectures for adversarial training make it possible to generate even high-resolution images with remarkable quality. At the same time, more and more effort is dedicated to controlling the content of generated images. In this paper, we take one further step in this direction and propose a conditional generative adversarial network (GAN) that generates images with a defined number of objects from given classes. This entails two fundamental abilities: (1) being able to generate high-quality images given a complex constraint and (2) being able to count object instances per class in a given image. Our proposed model modularly extends the successful StyleGAN2 architecture with a count-based conditioning as well as with a regression sub-network to count the number of generated objects per class during training. In experiments on three…
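The abstract describes two ingredients: a per-class count vector that conditions the generator, and a regression sub-network whose count predictions are compared with the requested counts during training. The following is a minimal sketch of how those pieces could fit together, not the authors' implementation. Every function name, the normalization by `max_count`, the concatenation-based conditioning, the mean-squared-error loss, and the `weight` parameter are assumptions made for illustration; the excerpt above does not specify any of them.

```python
def make_count_condition(counts, max_count):
    """Scale per-class object counts to [0, 1] (assumed normalization)."""
    if max_count <= 0:
        raise ValueError("max_count must be positive")
    return [c / max_count for c in counts]


def generator_input(latent, counts, max_count):
    """Append the normalized count vector to a latent vector.

    Concatenation is one simple conditioning scheme, assumed here;
    the paper may inject the condition differently.
    """
    return list(latent) + make_count_condition(counts, max_count)


def count_regression_loss(predicted, target):
    """Mean squared error between predicted and requested per-class counts
    (an assumed choice of regression loss)."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target need one entry per class")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def total_generator_loss(adversarial_loss, predicted, target, weight=1.0):
    """Combine an adversarial term with the weighted count term.

    `weight` is a hypothetical hyperparameter balancing the two terms.
    """
    return adversarial_loss + weight * count_regression_loss(predicted, target)


if __name__ == "__main__":
    # Request 2 objects of class A and 4 of class B, with at most 4 per class.
    z = [0.5, -0.5]
    print(generator_input(z, [2, 4], max_count=4))
    # A hypothetical count predictor over-counts class A by two objects.
    print(total_generator_loss(1.0, [3.0, 1.0], [1, 1], weight=0.5))
```

In a real model the predicted counts would come from the regression sub-network applied to generated images; here they are passed in directly so the loss arithmetic can be checked in isolation.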
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection · Image Processing and 3D Reconstruction
Methods: Weight Demodulation · R1 Regularization · Path Length Regularization · Convolution · StyleGAN2
