DR-GAN: Distribution Regularization for Text-to-Image Generation
Hongchen Tan, Xiuping Liu, Baocai Yin, Xin Li

TL;DR
This paper introduces DR-GAN, a novel text-to-image generation model that employs distribution regularization through semantic disentangling and distribution normalization modules, improving the quality and realism of generated images.
Contribution
The paper proposes two modules, a Semantic Disentangling Module (SDM) and a Distribution Normalization Module (DNM), which improve semantic extraction and distribution learning, respectively, in text-to-image synthesis.
Findings
Achieved competitive results on two public datasets.
Demonstrated improved semantic consistency in generated images.
Showed effectiveness of distribution normalization in GAN training.
Abstract
This paper presents a new Text-to-Image generation model, named Distribution Regularization Generative Adversarial Network (DR-GAN), to generate images from text descriptions through improved distribution learning. In DR-GAN, we introduce two novel modules: a Semantic Disentangling Module (SDM) and a Distribution Normalization Module (DNM). SDM combines the spatial self-attention mechanism and a new Semantic Disentangling Loss (SDL) to help the generator distill key semantic information for the image generation. DNM uses a Variational Auto-Encoder (VAE) to normalize and denoise the image latent distribution, which can help the discriminator better distinguish synthesized images from real images. DNM also adopts a Distribution Adversarial Loss (DAL) to guide the generator to align with normalized real image distributions in the latent space. Extensive experiments on two public datasets…
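The abstract says DNM uses a VAE to normalize the image latent distribution but gives no formulas here. As a rough illustration only, the sketch below shows the standard VAE ingredients that usually do this kind of normalization: the reparameterization trick and the KL term that pulls a diagonal Gaussian toward N(0, I). The function names `reparameterize` and `kl_to_standard_normal` are hypothetical, and the paper's actual DNM architecture, the DAL, and any loss weights are not reproduced.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), element-wise.

    mu and log_var are lists of floats describing a diagonal Gaussian.
    This is the generic VAE sampling step, not the paper's exact module.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for one latent vector.

    Minimizing this term pushes the latent distribution toward a
    standard normal, which is the usual sense of "normalizing" it.
    """
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


# Illustrative values only (assumed, not taken from the paper).
rng = random.Random(0)
latent_dim = 8
mu_zero = [0.0] * latent_dim
mu_shifted = [1.0] * latent_dim
log_var = [0.0] * latent_dim  # unit variance

z = reparameterize(mu_zero, log_var, rng)
kl_zero = kl_to_standard_normal(mu_zero, log_var)
kl_shifted = kl_to_standard_normal(mu_shifted, log_var)
```

A latent code that already matches N(0, I) incurs zero penalty, while shifting the mean away from zero increases the penalty, so a term of this form regularizes the encoded image features toward a common, well-behaved distribution.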
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Video Analysis and Summarization
Methods: ALIGN
