TL;DR
This paper introduces DPGAN, a novel GAN architecture with a double pooling module that enhances semantic dependency modeling for layout-to-image translation, resulting in more realistic and semantically consistent images.
Contribution
The paper proposes a new Double Pooling GAN with a Double Pooling Module that captures both short- and long-range semantic dependencies, improving image quality and semantic consistency.
Findings
Outperforms state-of-the-art methods on five datasets.
Effective semantic dependency modeling improves image realism.
Double pooling modules are versatile and can be integrated into other GANs.
Abstract
In this paper, we address the task of layout-to-image translation, which aims to translate an input semantic layout to a realistic image. One open challenge widely observed in existing methods is the lack of effective semantic constraints during the image translation process, leading to models that cannot preserve the semantic information and ignore the semantic dependencies within the same object. To address this issue, we propose a novel Double Pooling GAN (DPGAN) for generating photo-realistic and semantically consistent results from the input layout. We also propose a novel Double Pooling Module (DPM), which consists of the Square-shape Pooling Module (SPM) and the Rectangle-shape Pooling Module (RPM). Specifically, SPM aims to capture short-range semantic dependencies of the input layout with different spatial scales, while RPM aims to capture long-range semantic dependencies from…
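The difference between square-shape and rectangle-shape pooling can be illustrated with a toy example. The sketch below is not the authors' implementation. The function names `square_pool` and `rectangle_pool` are hypothetical, and so are the choices of average pooling, nearest-neighbor broadcasting back to the input size, and summing the row and column averages. Because the abstract is truncated, treating rectangle-shape pooling as averaging over a full row and a full column is also an assumption. The example only shows why a small square window aggregates nearby context while a long, thin window lets distant positions share information.

```python
def square_pool(feat, k):
    """Average over non-overlapping k x k windows, then broadcast each
    window's mean back to every cell of that window (hypothetical helper)."""
    h, w = len(feat), len(feat[0])
    out = [[0.0] * w for _ in range(h)]
    for i0 in range(0, h, k):
        for j0 in range(0, w, k):
            cells = [(i, j)
                     for i in range(i0, min(i0 + k, h))
                     for j in range(j0, min(j0 + k, w))]
            mean = sum(feat[i][j] for i, j in cells) / len(cells)
            for i, j in cells:
                out[i][j] = mean
    return out


def rectangle_pool(feat):
    """Average over each full row (1 x W window) and each full column
    (H x 1 window), then sum the two at every position (assumed fusion)."""
    h, w = len(feat), len(feat[0])
    row_mean = [sum(row) / w for row in feat]
    col_mean = [sum(feat[i][j] for i in range(h)) / h for j in range(w)]
    return [[row_mean[i] + col_mean[j] for j in range(w)] for i in range(h)]


# Toy 4 x 4 feature map with a single active cell in the top-left corner.
layout = [[0.0] * 4 for _ in range(4)]
layout[0][0] = 1.0

sq = square_pool(layout, 2)
rect = rectangle_pool(layout)

# The 2 x 2 square window spreads the signal only to immediate neighbors.
print(sq[0][1], sq[0][3])      # 0.25 0.0
# The row-wide window carries the signal to the far end of the same row.
print(rect[0][1], rect[0][3])  # 0.25 0.25
```

In this toy setting, the cell at the far end of the first row receives nothing from a 2 x 2 square window but does receive a contribution from the row-wide window. Calling `square_pool` with several values of `k` would give the multi-scale behavior the abstract attributes to SPM.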
