DiffuseST: Unleashing the Capability of the Diffusion Model for Style Transfer

Ying Hu; Chenyi Zhuang; Pan Gao

arXiv:2410.15007 · cs.CV · October 22, 2024

TL;DR

DiffuseST introduces a training-free style transfer method that combines textual and spatial features using diffusion models, enabling balanced and controllable artistic style transfer without retraining.

Contribution

The paper proposes a training-free approach that pairs textual embeddings with the step-by-step nature of the diffusion process to improve style transfer, so no network needs to be trained or fine-tuned.

Findings

01. Effective and robust style transfer results.
02. Enhanced control over the balance between content and style.
03. Potential applicability to other tasks.

Abstract

Style transfer aims to fuse the artistic representation of a style image with the structural information of a content image. Existing methods train specific networks or utilize pre-trained models to learn content and style features. However, they rely solely on textual or spatial representations that are inadequate to achieve the balance between content and style. In this work, we propose a novel and training-free approach for style transfer, combining textual embedding with spatial features and separating the injection of content or style. Specifically, we adopt the BLIP-2 encoder to extract the textual representation of the style image. We utilize the DDIM inversion technique to extract intermediate embeddings in content and style branches as spatial features. Finally, we harness the step-by-step property of diffusion models by separating the injection of content and style in the…
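The DDIM inversion step mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the linear beta schedule, the function names (`make_alphas_cumprod`, `ddim_invert`, `ddim_sample`), and the toy noise predictor are all assumptions made for illustration, and a real pipeline would use a pre-trained diffusion network as `eps_model`. The sketch inverts a clean latent into a noisy one with the deterministic DDIM update and keeps every intermediate latent, which is one plausible reading of "intermediate embeddings" used as spatial features.

```python
import numpy as np


def make_alphas_cumprod(num_steps=50, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta) for a linear beta schedule (assumed schedule)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def ddim_invert(x0, eps_model, alphas_cumprod):
    """Deterministic DDIM inversion from a clean latent to a noisy one.

    Returns the list of latents [x_0, x_1, ..., x_T] so that intermediate
    states can be reused later (hypothetical use: spatial features).
    """
    x = x0
    trajectory = [x0]
    a_prev = 1.0  # the clean input carries no noise, so alpha_bar = 1
    for t, a_next in enumerate(alphas_cumprod):
        eps = eps_model(x, t)
        # Clean latent implied by the current state and predicted noise.
        x0_pred = (x - np.sqrt(1.0 - a_prev) * eps) / np.sqrt(a_prev)
        # Re-noise the prediction to the next (noisier) level.
        x = np.sqrt(a_next) * x0_pred + np.sqrt(1.0 - a_next) * eps
        trajectory.append(x)
        a_prev = a_next
    return trajectory


def ddim_sample(x_T, eps_model, alphas_cumprod):
    """Deterministic DDIM sampling (eta = 0), the reverse of ddim_invert."""
    x = x_T
    for t in range(len(alphas_cumprod) - 1, -1, -1):
        a_t = alphas_cumprod[t]
        a_prev = alphas_cumprod[t - 1] if t > 0 else 1.0
        eps = eps_model(x, t)
        x0_pred = (x - np.sqrt(1.0 - a_t) * eps) / np.sqrt(a_t)
        x = np.sqrt(a_prev) * x0_pred + np.sqrt(1.0 - a_prev) * eps
    return x


def toy_eps_model(x, t):
    """Stand-in noise predictor that depends only on the step index (assumption)."""
    return np.full_like(x, 0.1 * np.cos(t))


if __name__ == "__main__":
    alphas = make_alphas_cumprod(num_steps=20)
    content_latent = np.linspace(-1.0, 1.0, 8).reshape(2, 4)
    latents = ddim_invert(content_latent, toy_eps_model, alphas)
    restored = ddim_sample(latents[-1], toy_eps_model, alphas)
    print(len(latents), np.allclose(restored, content_latent))
```

Because the toy predictor ignores its input `x`, sampling undoes the inversion exactly. With a real network the noise prediction depends on the latent, so the round trip is only approximate.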

Code & Models

Repositories

i2-multimedia-lab/diffusest (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis

Methods: Diffusion