LogoDiffuser: Training-Free Multilingual Logo Generation and Stylization via Letter-Aware Attention Control

Mingyu Kang, Hyein Seo, Yuna Jeong, Junhyeong Park, and Yong Suk Choi

arXiv:2603.09759 · cs.CV · March 11, 2026

TL;DR

LogoDiffuser is a training-free, multilingual logo generation method that uses attention control in diffusion transformers to produce visually appealing logos with accurate character structures across languages.

Contribution

It introduces a novel attention-based approach that integrates character structure and visual design without additional training, supporting multilingual logo synthesis.

Findings

01. Achieves state-of-the-art results in multilingual logo generation

02. Effectively controls character structure across languages

03. Demonstrates robustness without extra training data

Abstract

Recent advances in text-to-image generation have been remarkable, but generating multilingual design logos that harmoniously integrate visual and textual elements remains a challenging task. Existing methods often distort character geometry when applying creative styles and struggle to support multilingual text generation without additional training. To address these challenges, we propose LogoDiffuser, a training-free method that synthesizes multilingual logo designs using the multimodal diffusion transformer. Instead of using textual prompts, we input the target characters as images, enabling robust character structure control regardless of language. We first analyze the joint attention mechanism to identify core tokens, which are tokens that strongly respond to textual structures. With this observation, our method integrates character structure and visual design by injecting the most…
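The idea of "core tokens" can be made concrete with a small sketch. What follows is not the authors' implementation: the excerpt does not state the selection criterion, so the code assumes a hypothetical one, ranking tokens by the mean attention they receive from queries that lie on character strokes. The names `joint_attention`, `select_core_tokens`, `glyph_mask`, and `top_k` are illustrative and do not come from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def joint_attention(q, k):
    # Scaled dot-product attention weights over one concatenated
    # token sequence; each row sums to 1.
    d = q.shape[-1]
    return softmax(q @ k.T / np.sqrt(d), axis=-1)

def select_core_tokens(attn, glyph_mask, top_k):
    # ASSUMPTION (not from the paper): a token's "response" is the mean
    # attention it receives from queries flagged by glyph_mask, i.e.
    # tokens that lie on character strokes in the input glyph image.
    response = attn[glyph_mask].mean(axis=0)
    order = np.argsort(response)[::-1]  # highest response first
    return order[:top_k], response

# Toy demo with random features: 16 tokens, 8-dimensional heads.
rng = np.random.default_rng(0)
q = rng.normal(size=(16, 8))
k = rng.normal(size=(16, 8))
glyph_mask = np.zeros(16, dtype=bool)
glyph_mask[:6] = True  # pretend the first six tokens cover the strokes

attn = joint_attention(q, k)
core_idx, response = select_core_tokens(attn, glyph_mask, top_k=3)
print("candidate core tokens:", core_idx.tolist())
```

In a real multimodal diffusion transformer the attention maps would come from the model's joint attention layers rather than random features, and the criterion the paper actually uses may differ from this ranking.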

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Data Visualization and Analytics