WordCraft: Interactive Artistic Typography with Attention Awareness and Noise Blending
Zhe Wang, Jingbo Zhang, Tianyi Wei, Wanchao Su, Can Wang

TL;DR
WordCraft is an interactive typography system that combines diffusion models, regional attention, noise blending, and language understanding to enable flexible, high-quality, multi-character stylized text creation with user-driven refinement.
Contribution
It introduces a training-free regional attention mechanism and noise blending for interactive, multi-region typography generation, integrated with language models for flexible prompt interpretation.
Findings
Supports multi-language, multi-character stylization
Enables iterative refinement without quality loss
Enhances interactivity for creative typography design
Abstract
Artistic typography aims to stylize input characters with visual effects that are both creative and legible. Traditional approaches rely heavily on manual design, while recent generative models, particularly diffusion-based methods, have enabled automated character stylization. However, existing solutions remain limited in interactivity, lacking support for localized edits, iterative refinement, multi-character composition, and open-ended prompt interpretation. We introduce WordCraft, an interactive artistic typography system that integrates diffusion models to address these limitations. WordCraft features a training-free regional attention mechanism for precise, multi-region generation and a noise blending mechanism that supports continuous refinement without compromising visual quality. To support flexible, intent-driven generation, we incorporate a large language model to parse and structure…
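The abstract does not spell out how noise blending works. As a rough illustration only, the sketch below assumes it amounts to a mask-weighted mix of two noise maps, so that an edited region receives fresh noise while the rest of the canvas keeps its earlier noise. The function names, the mask convention, and the variance rescaling are all hypothetical and not taken from the paper.

```python
import math
import random


def blend_noise(kept, fresh, mask):
    """Mix two noise maps per pixel: mask=1 takes `fresh`, mask=0 keeps `kept`.

    Hypothetical helper; the paper's actual formulation may differ.
    """
    out = []
    for kept_row, fresh_row, mask_row in zip(kept, fresh, mask):
        row = []
        for k, f, m in zip(kept_row, fresh_row, mask_row):
            # Assumed design choice: rescale so that mixing two independent
            # unit-variance samples still yields unit variance.
            scale = math.sqrt(m * m + (1.0 - m) ** 2)
            row.append((m * f + (1.0 - m) * k) / scale)
        out.append(row)
    return out


def gaussian_map(height, width, rng):
    """Return a height x width grid of standard normal samples."""
    return [[rng.gauss(0.0, 1.0) for _ in range(width)] for _ in range(height)]


rng = random.Random(0)
kept = gaussian_map(4, 4, rng)    # noise from the previous generation pass
fresh = gaussian_map(4, 4, rng)   # new noise for the region being refined
# Refine only the right half of the canvas.
mask = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
blended = blend_noise(kept, fresh, mask)
```

With a binary mask the untouched region is reproduced exactly, which is one plausible reading of "refinement without compromising visual quality"; a soft mask would instead interpolate across the region boundary.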
Taxonomy
Topics: Digital Media and Visual Art
