TransText: Alpha-as-RGB Representation for Transparent Text Animation

Fei Zhang; Zijian Zhou; Bohao Tang; Sen He; Hang Li; Zhe Wang; Soubhik Sanyal; Pengfei Liu; Viktar Atliha; Tao Xiang; Frost Xu; Semih Gunel

arXiv:2603.17944 · cs.CV · March 20, 2026

Open Access

TL;DR

TransText introduces an Alpha-as-RGB paradigm for transparent text animation that jointly models appearance and transparency. It enables layer-aware glyph animation without retraining the underlying RGB-centric VAE, avoiding that retraining cost and preserving the priors the VAE learned from RGB data.

Contribution

TransText proposes what the authors describe as the first Alpha-as-RGB paradigm for transparent text animation. The paradigm avoids retraining the VAE and ensures cross-modal consistency.

Findings

01. TransText outperforms existing methods in generating coherent transparent animations.

02. The approach maintains semantic priors while modeling transparency.

03. It achieves diverse, fine-grained visual effects in text animation.

Abstract

We introduce the first method, to the best of our knowledge, for adapting image-to-video models to layer-aware text (glyph) animation, a capability critical for practical dynamic visual design. Existing approaches predominantly handle the transparency-encoding (alpha channel) as an extra latent dimension appended to the RGB space, necessitating the reconstruction of the underlying RGB-centric variational autoencoder (VAE). However, given the scarcity of high-quality transparent glyph data, retraining the VAE is computationally expensive and may erode the robust semantic priors learned from massive RGB corpora, potentially leading to latent pattern mixing. To mitigate these limitations, we propose TransText, a framework based on a novel Alpha-as-RGB paradigm to jointly model appearance and transparency without modifying the pre-trained generative manifold. TransText embeds the alpha…
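The abstract is cut off before it explains how TransText embeds the alpha channel, so the snippet below does not reproduce the paper's method. It is only a generic sketch of the underlying idea: a single-channel alpha matte can be presented to an RGB-only encoder as an ordinary three-channel image and recovered afterwards. The function names (`alpha_to_rgb`, `rgb_to_alpha`, `composite`) and the channel-replication scheme are assumptions made for illustration.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]


def alpha_to_rgb(alpha: List[List[float]]) -> List[List[Pixel]]:
    """Replicate a single-channel alpha matte (values in [0, 1]) into three
    identical channels so it has the shape of an RGB image.
    Assumption: plain replication; the paper's actual embedding is not shown
    in the truncated abstract."""
    return [[(a, a, a) for a in row] for row in alpha]


def rgb_to_alpha(rgb: List[List[Pixel]]) -> List[List[float]]:
    """Recover alpha from a three-channel image by averaging the channels
    and clamping the result to [0, 1]."""
    return [[min(1.0, max(0.0, sum(p) / 3.0)) for p in row] for row in rgb]


def composite(
    fg: List[List[Pixel]],
    alpha: List[List[float]],
    bg: List[List[Pixel]],
) -> List[List[Pixel]]:
    """Standard 'over' compositing per pixel: out = a * fg + (1 - a) * bg."""
    return [
        [
            tuple(a * f + (1.0 - a) * b for f, b in zip(fp, bp))
            for fp, a, bp in zip(frow, arow, brow)
        ]
        for frow, arow, brow in zip(fg, alpha, bg)
    ]


if __name__ == "__main__":
    matte = [[0.0, 0.5], [1.0, 0.25]]
    as_rgb = alpha_to_rgb(matte)      # three-channel, RGB-shaped view of alpha
    recovered = rgb_to_alpha(as_rgb)  # back to a single channel
    print(recovered)
```

Under this assumption the alpha matte never needs a fourth latent channel: it takes the same path as any RGB frame, and the recovered matte is then used for ordinary layer compositing.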

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Face recognition and analysis