Text to Sketch Generation with Multi-Styles

Tengjie Li; Shikui Tu; Lei Xu

arXiv:2511.04123 · cs.CV · November 7, 2025

Open Access

TL;DR

This paper introduces a training-free diffusion-based framework for sketch generation that allows explicit style control through textual prompts and reference sketches, improving style accuracy and flexibility.

Contribution

The paper presents a style guidance mechanism that reduces content leakage from reference sketches, and it supports multi-style generation through a joint AdaIN module, advancing style control in sketch synthesis.
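For readers unfamiliar with AdaIN (adaptive instance normalization), the sketch below shows the standard single-reference operation: each content channel is re-normalized to the mean and standard deviation of the matching style channel. It is an illustration only. This summary does not say how the paper's joint AdaIN module combines several references, so that part is not modeled here, and the function name `adain` and the toy feature values are assumptions.

```python
from statistics import fmean, pstdev


def adain(content, style, eps=1e-5):
    """Standard AdaIN on per-channel feature lists (illustrative, not the paper's module).

    content, style: lists of channels, each channel a list of floats.
    Returns content features whose per-channel mean/std match the style features.
    """
    out = []
    for c_chan, s_chan in zip(content, style):
        c_mu, c_sigma = fmean(c_chan), pstdev(c_chan)
        s_mu, s_sigma = fmean(s_chan), pstdev(s_chan)
        # Whiten the content channel, then re-color it with the style statistics.
        out.append([s_sigma * (v - c_mu) / (c_sigma + eps) + s_mu for v in c_chan])
    return out


# Toy features: two channels, four spatial positions each (hypothetical values).
content = [[0.0, 1.0, 2.0, 3.0], [10.0, 10.5, 11.0, 11.5]]
style = [[5.0, 7.0, 9.0, 11.0], [-1.0, 0.0, 1.0, 2.0]]
stylized = adain(content, style)
```

After the call, each channel of `stylized` keeps the spatial ordering of the content channel but carries the style channel's mean and (up to `eps`) its standard deviation.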

Findings

01 High-quality style-aligned sketch generation

02 Effective reduction of content leakage from references

03 Supports multi-style generation with improved flexibility

Abstract

Recent advances in vision-language models have facilitated progress in sketch generation. However, existing specialized methods primarily focus on generic synthesis and lack mechanisms for precise control over sketch styles. In this work, we propose a training-free framework based on diffusion models that enables explicit style guidance via textual prompts and referenced style sketches. Unlike previous style transfer methods that overwrite key and value matrices in self-attention, we incorporate the reference features as auxiliary information with linear smoothing and leverage a style-content guidance mechanism. This design effectively reduces content leakage from reference sketches and enhances synthesis quality, especially in cases with low structural similarity between reference and target sketches. Furthermore, we extend our framework to support controllable multi-style generation…
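The contrast the abstract draws between overwriting keys and values and using reference features as auxiliary information can be sketched as follows. This is one possible reading, not the authors' implementation: the abstract does not define "linear smoothing", so the blend weight `lam`, the concatenation of reference keys/values, and all function names are assumptions made for illustration.

```python
import math


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Plain scaled dot-product attention over lists of vectors."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        weights = softmax([scale * sum(a * b for a, b in zip(q, k)) for k in keys])
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(len(values[0]))])
    return out


def overwrite_attention(q, k_ref, v_ref):
    # Baseline named in the abstract: target keys/values are replaced by the reference's.
    return attend(q, k_ref, v_ref)


def auxiliary_attention(q, k_tgt, v_tgt, k_ref, v_ref, lam=0.5):
    # ASSUMPTION: reference keys/values are appended to the target's own, and the
    # result is linearly blended with ordinary self-attention using weight lam.
    self_out = attend(q, k_tgt, v_tgt)
    joint_out = attend(q, k_tgt + k_ref, v_tgt + v_ref)
    return [[(1 - lam) * s + lam * j for s, j in zip(s_row, j_row)]
            for s_row, j_row in zip(self_out, joint_out)]


# Toy tensors (hypothetical): two target tokens, one reference token, dimension 2.
q = [[1.0, 0.0], [0.0, 1.0]]
k_tgt = [[1.0, 0.0], [0.0, 1.0]]
v_tgt = [[1.0, 2.0], [3.0, 4.0]]
k_ref = [[1.0, 1.0]]
v_ref = [[10.0, 20.0]]

overwritten = overwrite_attention(q, k_ref, v_ref)
blended = auxiliary_attention(q, k_tgt, v_tgt, k_ref, v_ref, lam=0.5)
```

In the overwrite variant every output token is built purely from reference values, which is where content leakage can come from. In the auxiliary variant the target's own keys and values stay in play, and with `lam=0` the output reduces to ordinary self-attention.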

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · 3D Shape Modeling and Analysis