ArtWeaver: Advanced Dynamic Style Integration via Diffusion Model
Chengming Xu, Kai Hu, Qilin Wang, Donghao Luo, Jiangning Zhang, Xiaobin Hu, Yanwei Fu, Chengjie Wang

TL;DR
ArtWeaver is a diffusion-based framework for stylized text-to-image generation. It combines a mixed style descriptor with a dynamic attention adapter so that generated images follow the reference style more faithfully while staying consistent with the text prompt.
Contribution
The paper proposes two modules, the mixed style descriptor and the dynamic attention adapter, to improve style integration and semantic control in diffusion models.
Findings
Outperforms existing methods in style diversity and semantic accuracy
Produces images with better style consistency and semantic fidelity
Demonstrates robustness across various style references and text prompts
Abstract
Stylized Text-to-Image Generation (STIG) aims to generate images from text prompts and style reference images. In this paper, we present ArtWeaver, a novel framework that leverages pretrained Stable Diffusion (SD) to address challenges such as misinterpreted styles and inconsistent semantics. Our approach introduces two innovative modules: the mixed style descriptor and the dynamic attention adapter. The mixed style descriptor enhances SD by combining content-aware and frequency-disentangled embeddings from CLIP with additional sources that capture global statistics and textual information, thus providing a richer blend of style-related and semantic-related knowledge. To achieve a better balance between adapter capacity and semantic control, the dynamic attention adapter is integrated into the diffusion UNet, dynamically calculating adaptation weights based on the style descriptors.…
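The idea of "dynamically calculating adaptation weights based on the style descriptors" can be illustrated with a small sketch. This is not the authors' implementation: the class `DynamicAdapter`, the softmax gate, the low-rank expert branches, and all dimensions are hypothetical choices made for illustration, and a random matrix stands in for a frozen pretrained projection. It only shows one plausible way a style vector could decide how strongly each adapter branch modifies a frozen layer.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix used as a placeholder for learned weights."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class DynamicAdapter:
    """Hypothetical style-conditioned adapter around one frozen projection."""

    def __init__(self, dim, style_dim, num_experts=4, rank=2, seed=0):
        rng = random.Random(seed)
        # Assumption: stands in for a frozen pretrained attention projection.
        self.frozen = rand_matrix(dim, dim, rng)
        # Assumption: a linear gate maps the style descriptor to expert logits.
        self.gate = rand_matrix(num_experts, style_dim, rng)
        # Assumption: each expert is a low-rank (down, up) pair.
        self.down = [rand_matrix(rank, dim, rng) for _ in range(num_experts)]
        self.up = [rand_matrix(dim, rank, rng) for _ in range(num_experts)]

    def weights(self, style):
        """Adaptation weights computed from the style descriptor."""
        return softmax(matvec(self.gate, style))

    def forward(self, x, style):
        """Frozen output plus a style-weighted sum of low-rank updates."""
        out = matvec(self.frozen, x)
        for g, down, up in zip(self.weights(style), self.down, self.up):
            delta = matvec(up, matvec(down, x))
            out = [o + g * d for o, d in zip(out, delta)]
        return out


adapter = DynamicAdapter(dim=8, style_dim=6)
x = [1.0] * 8
style_a = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
style_b = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]
w_a = adapter.weights(style_a)
w_b = adapter.weights(style_b)
y_a = adapter.forward(x, style_a)
y_b = adapter.forward(x, style_b)
```

In this sketch the frozen weights never change; only the mixing weights depend on the style vector, so two different style descriptors yield different adaptation weights for the same input.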
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Image Retrieval and Classification Techniques · Computer Graphics and Visualization Techniques
Methods: Contrastive Language-Image Pre-training · Adapter · Diffusion
