AnyText2: Visual Text Generation and Editing With Customizable Attributes

Yuxiang Tuo; Yifeng Geng; Liefeng Bo

arXiv:2411.15245·cs.CV·November 26, 2024

TL;DR

AnyText2 extends text-to-image generation with precise control over multilingual text attributes such as font and color, while improving image realism, inference speed, and text accuracy in scene image generation and editing.

Contribution

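It introduces the WriteNet+AttnX architecture, together with techniques for extracting fonts and colors from scene images, to control text attributes in generated and edited scene images, improving realism and text accuracy over its predecessor, AnyText.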

Findings

01 19.8% faster inference speed than AnyText

02 3.3% and 9.3% improvements in text accuracy for Chinese and English, respectively

03 State-of-the-art performance demonstrated

Abstract

As the text-to-image (T2I) domain progresses, generating text that seamlessly integrates with visual content has garnered significant attention. However, even with accurate text generation, the inability to control font and color can greatly limit certain applications, and this issue remains insufficiently addressed. This paper introduces AnyText2, a novel method that enables precise control over multilingual text attributes in natural scene image generation and editing. Our approach consists of two main components. First, we propose a WriteNet+AttnX architecture that injects text rendering capabilities into a pre-trained T2I model. Compared to its predecessor, AnyText, our new approach not only enhances image realism but also achieves a 19.8% increase in inference speed. Second, we explore techniques for extracting fonts and colors from scene images and develop a Text Embedding Module…

Code & Models

Repositories

tyxsspa/anytext2
PyTorch · Official

Taxonomy

Topics: Human Motion and Animation · 3D Modeling in Geospatial Applications · Augmented Reality Applications