Text-Animator: Controllable Visual Text Video Generation

Lin Liu; Quande Liu; Shengju Qian; Yuan Zhou; Wengang Zhou; Houqiang Li; Lingxi Xie; Qi Tian

arXiv:2406.17777·cs.CV·June 26, 2024

TL;DR

Text-Animator is a method for generating videos with visually accurate and stable embedded text. It addresses the challenge of visualizing text in text-to-video (T2V) generation by controlling camera and text motion.

Contribution

It introduces text embedding injection, camera control, and text refinement modules to improve the accuracy and stability of visualized text in generated videos.

Findings

01. Outperforms state-of-the-art methods in visual text accuracy

02. Improves stability of visualized text through camera and motion control

03. Demonstrates superior qualitative and quantitative results

Abstract

Video generation is a challenging yet pivotal task in various industries, such as gaming, e-commerce, and advertising. One significant unresolved aspect of Text-to-Video (T2V) generation is the effective visualization of text within generated videos. Despite the progress achieved in T2V generation, current methods still cannot effectively visualize text in videos directly, as they mainly focus on summarizing semantic scene information and on understanding and depicting actions. While recent advances in image-level visual text generation show promise, transferring these techniques to the video domain faces problems, notably in preserving textual fidelity and motion coherence. In this paper, we propose an innovative approach termed Text-Animator for visual text video generation. Text-Animator contains a text embedding injection module to precisely depict the structures of visual text in generated…

Taxonomy

Topics: Human Motion and Animation · Video Analysis and Summarization · Artificial Intelligence in Games
