CoCoDiff: Correspondence-Consistent Diffusion Model for Fine-grained Style Transfer

Wenbo Nie; Zixiang Li; Renshuai Tao; Bin Wu; Yunchao Wei; Yao Zhao

arXiv:2602.14464·cs.CV·April 2, 2026

TL;DR

CoCoDiff is a training-free, diffusion-based style transfer method that achieves fine-grained, semantically consistent stylization by leveraging pixel-wise correspondence and cycle consistency.

Contribution

CoCoDiff introduces a training-free framework that uses pretrained diffusion models for detailed, semantically aligned style transfer without extra supervision.

Findings

1. Outperforms existing methods in visual quality and quantitative metrics.

2. Achieves object and region-level stylization while preserving geometry and details.

3. Operates without additional training or annotations.

Abstract

Transferring visual style between images while preserving semantic correspondence between similar objects remains a central challenge in computer vision. Existing methods have made great strides, but most operate at a global level and overlook region-wise and even pixel-wise semantic correspondence. To address this, we propose CoCoDiff, a novel training-free and low-cost style transfer framework that leverages pretrained latent diffusion models to achieve fine-grained, semantically consistent stylization. We identify that correspondence cues within generative diffusion models are under-explored and that content consistency across semantically matched regions is often neglected. CoCoDiff introduces a pixel-wise semantic correspondence module that mines intermediate diffusion features to construct a dense alignment map between content and style images. Furthermore, a…
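
The idea of a dense alignment map can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes per-location feature vectors have already been extracted (here they are hand-written toy values rather than diffusion features), matches each content location to its most similar style location by cosine similarity, and applies a generic forward-backward check as one common way to test cycle consistency. The function names `dense_correspondence` and `cycle_consistent` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def dense_correspondence(src_feats, dst_feats):
    """For each source location, return the index of the most similar target location."""
    return [
        max(range(len(dst_feats)), key=lambda j: cosine(s, dst_feats[j]))
        for s in src_feats
    ]

def cycle_consistent(forward, backward):
    """Mark source locations whose match maps back to them (forward-backward check)."""
    return [backward[j] == i for i, j in enumerate(forward)]

# Toy stand-ins for features (assumption): 3 locations per image, 2 channels each.
content = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
style = [[0.0, 2.0], [3.0, 0.0], [1.0, 1.0]]

forward = dense_correspondence(content, style)   # content -> style
backward = dense_correspondence(style, content)  # style -> content
mask = cycle_consistent(forward, backward)

print(forward)   # [1, 0, 1]
print(backward)  # [1, 0, 2]
print(mask)      # [True, True, False]
```

In this toy case the third content location matches style location 1, but that style location maps back to content location 0, so the match fails the forward-backward check and would be treated as unreliable.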


Videos

CoCoDiff: Correspondence-Consistent Diffusion Model for Fine-grained Style Transfer· slideslive