ArtWeaver: Advanced Dynamic Style Integration via Diffusion Model

Chengming Xu; Kai Hu; Qilin Wang; Donghao Luo; Jiangning Zhang; Xiaobin Hu; Yanwei Fu; Chengjie Wang

arXiv:2405.15287 · cs.CV · November 19, 2024

TL;DR

ArtWeaver introduces a novel diffusion-based framework that improves stylized text-to-image generation by integrating mixed style descriptors and dynamic attention adapters, resulting in more accurate style and semantic consistency.

Contribution

The paper proposes two modules, a mixed style descriptor and a dynamic attention adapter, to improve style integration and semantic control in diffusion models.

Findings

01. Outperforms existing methods in style diversity and semantic accuracy

02. Produces images with better style consistency and semantic fidelity

03. Demonstrates robustness across various style references and text prompts

Abstract

Stylized Text-to-Image Generation (STIG) aims to generate images from text prompts and style reference images. In this paper, we present ArtWeaver, a novel framework that leverages pretrained Stable Diffusion (SD) to address challenges such as misinterpreted styles and inconsistent semantics. Our approach introduces two innovative modules: the mixed style descriptor and the dynamic attention adapter. The mixed style descriptor enhances SD by combining content-aware and frequency-disentangled embeddings from CLIP with additional sources that capture global statistics and textual information, thus providing a richer blend of style-related and semantic-related knowledge. To achieve a better balance between adapter capacity and semantic control, the dynamic attention adapter is integrated into the diffusion UNet, dynamically calculating adaptation weights based on the style descriptors.…
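
The abstract says the dynamic attention adapter computes adaptation weights from the style descriptors. One way to picture that idea is a low-rank update whose per-rank gates are a function of the style vector. The sketch below is only an illustration under that assumption: the class `DynamicAdapter`, its `down`, `up`, and `gate` matrices, the sigmoid gating, and all dimensions are hypothetical and are not taken from the paper, whose actual formulation may differ.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix standing in for learned parameters."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class DynamicAdapter:
    """Hypothetical adapter: a low-rank residual update whose per-rank
    gates are computed from a style descriptor (an assumption, not the
    paper's definition)."""

    def __init__(self, feature_dim, style_dim, rank, seed=0):
        rng = random.Random(seed)
        self.down = random_matrix(rank, feature_dim, rng)  # feature -> rank
        self.up = random_matrix(feature_dim, rank, rng)    # rank -> feature
        self.gate = random_matrix(rank, style_dim, rng)    # style -> gates

    def adaptation_weights(self, style):
        # One gate in (0, 1) per rank component, driven by the style vector.
        return [sigmoid(z) for z in matvec(self.gate, style)]

    def forward(self, feature, style):
        weights = self.adaptation_weights(style)
        hidden = matvec(self.down, feature)
        gated = [w * h for w, h in zip(weights, hidden)]
        delta = matvec(self.up, gated)
        # Residual form: the frozen feature plus a style-dependent update.
        return [f + d for f, d in zip(feature, delta)]


adapter = DynamicAdapter(feature_dim=4, style_dim=3, rank=2)
feature = [0.5, -1.0, 0.25, 2.0]
style_a = [1.0, 0.0, -1.0]
style_b = [-2.0, 3.0, 0.5]
out_a = adapter.forward(feature, style_a)
out_b = adapter.forward(feature, style_b)
```

The same input feature yields different outputs for `style_a` and `style_b` because the gates change with the descriptor, while the residual form leaves the original feature as the baseline when the update is small.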

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Image Retrieval and Classification Techniques · Computer Graphics and Visualization Techniques

Methods: Contrastive Language-Image Pre-training · Adapter · Diffusion