Stylistic Attribute Control in Latent Diffusion Models
Max Reimann, Benito Buchheim, Jürgen Döllner

TL;DR
This paper introduces a method for fine-grained, disentangled control of stylistic attributes in latent diffusion models, enabling precise and continuous stylistic editing while preserving image content.
Contribution
It proposes a novel approach combining guidance composition, regularization loss, and enhanced DDIM inversion to achieve disentangled, controllable stylistic editing in diffusion models.
Findings
The method enables precise, continuous stylistic modifications.
It preserves original image semantics during editing.
It outperforms current text-based editing techniques in style control.
Abstract
Text-to-image diffusion models have revolutionized image synthesis and editing, but precise control over stylistic attributes remains a challenge, often causing unintended content modifications. We propose an approach for fine-grained parametric control of stylistic attributes in latent diffusion models by learning disentangled editing directions from synthetic datasets. We use guidance composition to close the domain gap between stylistically finetuned and foundation models, preserving the original image semantics while applying stylistic adjustments. To ensure consistent edits, we introduce a training regularization loss and enhance DDIM inversion with optimized null-conditional embeddings for real image editing. We validate our approach by learning from stylistically filtered synthetic datasets varying a range of stylistic attributes, including outlines, local contrast,…
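For readers unfamiliar with the DDIM inversion mentioned in the abstract, the following is a minimal sketch of the standard deterministic DDIM update, which inversion builds on. It is not the authors' enhanced variant: the optimized null-conditional embeddings are not modeled here, and the function names, the list-of-floats latent, and the fixed noise prediction `eps` are illustrative assumptions. In a real pipeline `eps` would come from the diffusion model's noise predictor.

```python
import math

def predict_x0(x_t, eps, alpha_bar_t):
    """Estimate the clean latent from a noisy latent and a noise prediction."""
    return [(x - math.sqrt(1.0 - alpha_bar_t) * e) / math.sqrt(alpha_bar_t)
            for x, e in zip(x_t, eps)]

def ddim_step(x_t, eps, alpha_bar_from, alpha_bar_to):
    """Deterministic DDIM update between two noise levels.

    Moving to a smaller alpha_bar (more noise) is an inversion step;
    moving to a larger alpha_bar (less noise) is a sampling step.
    """
    x0 = predict_x0(x_t, eps, alpha_bar_from)
    return [math.sqrt(alpha_bar_to) * x + math.sqrt(1.0 - alpha_bar_to) * e
            for x, e in zip(x0, eps)]

# Hypothetical toy values: a 2-element "latent" and a fixed noise prediction.
latent = [0.5, -1.0]
eps = [0.1, 0.2]

inverted = ddim_step(latent, eps, alpha_bar_from=0.9, alpha_bar_to=0.5)
restored = ddim_step(inverted, eps, alpha_bar_from=0.5, alpha_bar_to=0.9)
```

With an identical noise prediction in both directions, the round trip recovers the starting latent up to floating-point error. With a real model the prediction is evaluated at different latents in each direction, so reconstructions drift; that drift is presumably what the optimized null-conditional embeddings described in the abstract are meant to reduce.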
