HiCAST: Highly Customized Arbitrary Style Transfer with Adapter Enhanced Diffusion Models

Hanzhang Wang; Haoran Wang; Jinze Yang; Zhongrui Yu; Zeke Xie; Lei Tian; Xinyan Xiao; Junjun Jiang; Xianming Liu; Mingming Sun

arXiv:2401.05870·cs.CV·January 12, 2024·2 cites

Open Access

TL;DR

HiCAST introduces a flexible, user-controllable style transfer method using diffusion models, enabling customized stylization for images and videos with improved consistency and quality.

Contribution

The paper presents HiCAST, a novel style transfer approach that incorporates a Style Adapter into latent diffusion models for highly customizable and semantically guided stylization.

Findings

01. Outperforms state-of-the-art methods in visual quality and user satisfaction.

02. Achieves better cross-frame temporal consistency in video stylization.

03. Provides flexible manipulation of style information through the Style Adapter.

Abstract

The goal of Arbitrary Style Transfer (AST) is to inject the artistic features of a style reference into a given image or video. Existing methods usually focus on balancing style and content while ignoring the significant demand for flexible, customized stylization results, which limits their practical application. To address this issue, we propose HiCAST, a novel AST approach that can explicitly customize the stylization results according to various sources of semantic clues. Specifically, our model is built on the Latent Diffusion Model (LDM) and is designed to absorb content and style instances as conditions of the LDM. It is characterized by the introduction of a Style Adapter, which allows users to flexibly manipulate the output by aligning multi-level style information with the intrinsic knowledge in the LDM.…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Cinema and Media Studies · Aesthetic Perception and Analysis

Methods: Diffusion · Latent Diffusion Model · Focus