OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data
Yiren Song, Cheng Liu, Mike Zheng Shou

TL;DR
OmniConsistency introduces a universal, plug-and-play consistency module for diffusion-based image stylization, significantly improving visual coherence and style preservation across complex scenes and style transfer pipelines.
Contribution
It proposes an in-context consistency learning framework and a two-stage training strategy that decouples style learning from consistency preservation in diffusion models, narrowing the gap to proprietary stylization methods.
Findings
Enhanced stylization consistency in complex scenes
Improved style preservation without degradation
Performance comparable to state-of-the-art proprietary models
Abstract
Diffusion models have advanced image stylization significantly, yet two core challenges persist: (1) maintaining consistent stylization in complex scenes, particularly identity, composition, and fine details, and (2) preventing style degradation in image-to-image pipelines with style LoRAs. GPT-4o's exceptional stylization consistency highlights the performance gap between open-source methods and proprietary models. To bridge this gap, we propose OmniConsistency, a universal consistency plugin leveraging large-scale Diffusion Transformers (DiTs). OmniConsistency contributes: (1) an in-context consistency learning framework trained on aligned image pairs for robust generalization; (2) a two-stage progressive learning strategy decoupling style learning from consistency preservation to mitigate style degradation; and (3) a fully plug-and-play design compatible with arbitrary style…
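The two-stage strategy in contribution (2) can be pictured as a schedule over which parameter groups are trainable. The toy sketch below is not the authors' implementation: `ParamGroup`, `two_stage_schedule`, the step counts, and the round-robin rotation of frozen style LoRAs in the second stage are all hypothetical, chosen only to illustrate how style learning can be kept separate from consistency learning. No model or data is involved; each "update" is just a counter.

```python
from dataclasses import dataclass


@dataclass
class ParamGroup:
    """Stand-in for a set of weights; counts how often it is updated."""
    name: str
    trainable: bool = False
    updates: int = 0

    def step(self) -> None:
        # Frozen groups ignore optimizer steps.
        if self.trainable:
            self.updates += 1


def two_stage_schedule(style_names, stage1_steps, stage2_steps):
    """Hypothetical schedule: style LoRAs first, consistency module second."""
    backbone = ParamGroup("backbone")  # assumed frozen throughout
    style_loras = {s: ParamGroup(f"style_lora[{s}]") for s in style_names}
    consistency = ParamGroup("consistency_module")

    # Stage 1 (style learning): each style LoRA is trained on its own,
    # while the backbone and the consistency module stay frozen.
    for lora in style_loras.values():
        lora.trainable = True
        for _ in range(stage1_steps):
            lora.step()
        lora.trainable = False

    # Stage 2 (consistency learning): style LoRAs are frozen; only the
    # consistency module is updated. Rotating the active style per step
    # is an assumption made here so the module never sees a single style.
    consistency.trainable = True
    for i in range(stage2_steps):
        active = style_loras[style_names[i % len(style_names)]]
        active.step()       # no-op: the active style LoRA is frozen
        consistency.step()  # the only group that changes in stage 2
    consistency.trainable = False

    return backbone, style_loras, consistency


if __name__ == "__main__":
    bb, loras, cons = two_stage_schedule(["sketch", "clay"], 3, 5)
    for group in [bb, *loras.values(), cons]:
        print(group.name, group.updates)
```

Because the style LoRAs receive no updates in the second stage, whatever the consistency module learns cannot be absorbed into a particular style's weights, which is the intuition behind a style-agnostic, plug-and-play module.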
Taxonomy
Topics: Natural Language Processing Techniques · Second Language Acquisition and Learning · Text Readability and Simplification
Methods: Diffusion
