FonTS: Text Rendering with Typography and Style Controls
Wenda Shi, Yiren Song, Dengming Zhang, Jiaming Liu, Xingxing Zou

TL;DR
This paper introduces a two-stage diffusion transformer pipeline that enhances word-level typographic and style control in text rendering, addressing inconsistencies and limited control in previous methods.
Contribution
It presents a novel typography control fine-tuning method and a style control adapter, along with a new dataset, for improved fine-grained text rendering control.
Findings
Achieves superior word-level typographic control
Enhances font and style consistency in text rendering
Demonstrates effectiveness through comprehensive experiments
Abstract
Visual text rendering is widespread in real-world applications and requires careful font selection and typographic choices. Recent progress in diffusion transformer (DiT)-based text-to-image (T2I) models shows promise in automating these processes. However, these methods still encounter challenges such as inconsistent fonts, style variation, and limited fine-grained control, particularly at the word level. This paper proposes a two-stage DiT-based pipeline to address these problems by enhancing controllability over typography and style in text rendering. We introduce typography control fine-tuning (TC-FT), a parameter-efficient fine-tuning method (on key parameters) with enclosing typography control tokens (ETC-tokens), which enables precise word-level application of typographic features. To further address style inconsistency in text rendering, we propose a text-agnostic…
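The abstract says that ETC-tokens enclose words so that typographic features apply at the word level. The sketch below illustrates that general idea of wrapping selected words of a prompt in paired open/close markers. It is not the paper's implementation: the marker strings, the attribute names (bold, italic, underline), and the helper `enclose_words` are all hypothetical and chosen only for illustration.

```python
# Hypothetical illustration only: the marker strings and attribute names below
# are invented for this sketch and are not the paper's ETC-token vocabulary.
from typing import Dict, Set, Tuple

# Assumed paired markers: one opening and one closing token per attribute.
MARKERS: Dict[str, Tuple[str, str]] = {
    "bold": ("<b>", "</b>"),
    "italic": ("<i>", "</i>"),
    "underline": ("<u>", "</u>"),
}


def enclose_words(text: str, styles: Dict[int, Set[str]]) -> str:
    """Wrap the word at each index in `styles` with the markers for its attributes."""
    out = []
    for idx, word in enumerate(text.split()):
        # Sort attributes so the nesting order is deterministic.
        for attr in sorted(styles.get(idx, ())):
            open_tok, close_tok = MARKERS[attr]
            word = f"{open_tok}{word}{close_tok}"
        out.append(word)
    return " ".join(out)


# Style the second word bold and the third word bold and italic.
prompt = enclose_words("Grand Opening Sale", {1: {"bold"}, 2: {"bold", "italic"}})
print(prompt)
```

Because each styled word carries its own opening and closing marker, the span an attribute applies to is unambiguous even when neighbouring words use different attributes.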
Taxonomy
Topics: Human Motion and Animation · Digital Humanities and Scholarship · Artificial Intelligence in Games
Methods: Diffusion · Semantic Cross Attention · Adapter
