DiffuseGAE: Controllable and High-fidelity Image Manipulation from Disentangled Representation
Yipeng Leng, Qiangjuan Huang, Zhiyuan Wang, Yangyang Liu, Haoyu Zhang

TL;DR
DiffuseGAE adds a group-supervised autoencoder (GAE) module to diffusion autoencoders (Diff-AE), improving latent disentanglement and enabling controllable, high-fidelity, multi-attribute image editing with efficient computation.
Contribution
The paper proposes a Group-supervised AutoEncoder (GAE) module for Diff-AE that improves disentanglement and enables generic multi-attribute image manipulation at reduced computational cost.
Findings
Enables multiple-attribute image manipulation.
Achieves convincing sample quality and attribute alignment.
Reduces computational requirements compared to pixel-based methods.
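As a rough illustration of what multi-attribute editing on a disentangled latent code could look like, the sketch below assumes the latent vector is partitioned into fixed per-attribute chunks, and it edits one code by copying selected chunks from a reference code. The chunk layout, the attribute names, and the `swap_attributes` helper are hypothetical and not taken from the paper. Decoding the edited code back into an image is omitted.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: which slice of the latent vector holds each attribute.
# Names and sizes are illustrative assumptions, not taken from the paper.
ATTRIBUTE_SLICES: Dict[str, Tuple[int, int]] = {
    "hair": (0, 2),
    "smile": (2, 4),
    "age": (4, 6),
}


def swap_attributes(
    source: List[float], reference: List[float], attributes: List[str]
) -> List[float]:
    """Return a copy of `source` whose chunks for `attributes` come from `reference`."""
    if len(source) != len(reference):
        raise ValueError("latent codes must have the same length")
    edited = list(source)  # copy so the source code is left untouched
    for name in attributes:
        start, stop = ATTRIBUTE_SLICES[name]
        edited[start:stop] = reference[start:stop]
    return edited


# Toy latent codes standing in for encoder outputs.
source_code = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
reference_code = [1.1, 1.2, 1.3, 1.4, 1.5, 1.6]

# Edit two attributes at once; the "smile" chunk keeps its source values.
edited_code = swap_attributes(source_code, reference_code, ["hair", "age"])
```

In this toy setup, editing several attributes in one pass is only a matter of listing more names, which is the kind of convenience a well-disentangled latent space is meant to provide.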
Abstract
Diffusion probabilistic models (DPMs) have shown remarkable results on various image synthesis tasks such as text-to-image generation and image inpainting. However, compared to other generative methods like VAEs and GANs, DPMs lack a low-dimensional, interpretable, and well-decoupled latent code. Recently, diffusion autoencoders (Diff-AE) were proposed to explore the potential of DPMs for representation learning via autoencoding. Diff-AE provides an accessible latent space that exhibits remarkable interpretability, allowing image attributes to be manipulated through latent codes from that space. However, previous works are not generic, as they operate on only a few attributes. To further explore the latent space of Diff-AE and achieve a generic editing pipeline, we propose a module called Group-supervised AutoEncoder (dubbed GAE) for Diff-AE to achieve better disentanglement on…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Cell Image Analysis Techniques · Digital Media Forensic Detection
Methods: Diffusion
