Towards Highly Realistic Artistic Style Transfer via Stable Diffusion with Step-aware and Layer-aware Prompt

Zhanjie Zhang; Quanwei Zhang; Huaizhong Lin; Wei Xing; Juncheng Mo; Shuaicheng Huang; Jinheng Xie; Guangyuan Li; Junsheng Luan; Lei Zhao; Dalong Zhang; Lixia Chen

arXiv:2404.11474·cs.CV·August 13, 2024·2 cites

Open Access · 1 Repo

TL;DR

This paper introduces LSAST, a diffusion model-based artistic style transfer method that produces highly realistic stylized images while preserving content structure, using a novel prompt space and inversion technique.

Contribution

The paper proposes a new diffusion-based style transfer approach with a step-aware and layer-aware prompt space and a novel inversion method, enhancing realism and content preservation.

Findings

01 Outperforms state-of-the-art style transfer methods in realism
02 Effectively preserves content structure in stylized images
03 Reduces artifacts and disharmonious patterns

Abstract

Artistic style transfer aims to transfer a learned artistic style onto an arbitrary content image, generating artistic stylized images. Existing generative adversarial network-based methods fail to generate highly realistic stylized images and always introduce obvious artifacts and disharmonious patterns. Recently, large-scale pre-trained diffusion models have opened up a new way of generating highly realistic artistic stylized images. However, diffusion model-based methods generally fail to preserve the content structure of input content images well, introducing undesired content structure and style patterns. To address the above problems, we propose a novel pre-trained diffusion-based artistic style transfer method, called LSAST, which can generate highly realistic artistic stylized images while preserving the content structure of input content images well, without bringing obvious…

Code & Models

Repositories

jamie-cheung/lsast
PyTorch · Official

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Music Technology and Sound Studies · Computer Graphics and Visualization Techniques

Methods: Sparse Evolutionary Training · Diffusion