TC-AE: Unlocking Token Capacity for Deep Compression Autoencoders

Teng Li; Ziyuan Huang; Cong Chen; Yangfu Li; Yuanhuiyi Lyu; Dandan Zheng; Chunhua Shen; Jun Zhang

arXiv:2604.07340·cs.CV·April 9, 2026

TL;DR

TC-AE is a ViT-based deep compression autoencoder that improves both reconstruction and generation by easing aggressive token-to-latent compression and strengthening the semantic structure of its tokens.

Contribution

The paper presents a ViT-based autoencoder architecture that decomposes token-to-latent compression into two stages and uses joint self-supervised training to prevent latent collapse.

Findings

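01 Achieves better reconstruction quality under high compression ratios.

02 Improves generative performance by enhancing the semantic structure of the tokens.

03 Identifies aggressive token-to-latent compression as the key factor limiting token number scaling, and addresses it by splitting that compression into two stages.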

Abstract

We propose TC-AE, a ViT-based architecture for deep compression autoencoders. Existing methods commonly increase the channel number of latent representations to maintain reconstruction quality under high compression ratios. However, this strategy often leads to latent representation collapse, which degrades generative performance. Instead of relying on increasingly complex architectures or multi-stage training schemes, TC-AE addresses this challenge from the perspective of the token space, the key bridge between pixels and image latents, through two complementary innovations. First, we study token number scaling by adjusting the patch size in ViT under a fixed latent budget, and identify aggressive token-to-latent compression as the key factor that limits effective scaling. To address this issue, we decompose token-to-latent compression into two stages, reducing structural information…
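
The token number scaling described in the abstract can be illustrated with some simple arithmetic. The sketch below is not taken from the paper or its repository: the image size, the latent budget, the patch sizes, and the helper names `token_count` and `token_to_latent_ratio` are all assumptions chosen for illustration, and the ratio is a plain count-based reading of "token-to-latent compression". It shows that shrinking the ViT patch size while the number of latents stays fixed increases how many tokens must be squeezed into each latent.

```python
def token_count(image_size: int, patch_size: int) -> int:
    """Number of non-overlapping ViT patches (tokens) for a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    side = image_size // patch_size
    return side * side


def token_to_latent_ratio(image_size: int, patch_size: int, num_latents: int) -> float:
    """Tokens per latent: a count-based proxy for token-to-latent compression."""
    return token_count(image_size, patch_size) / num_latents


# Hypothetical settings, not values reported by the paper.
IMAGE_SIZE = 256   # assumed square input resolution
NUM_LATENTS = 64   # assumed fixed latent budget (an 8 x 8 latent grid)

if __name__ == "__main__":
    for patch_size in (32, 16, 8):
        tokens = token_count(IMAGE_SIZE, patch_size)
        ratio = token_to_latent_ratio(IMAGE_SIZE, patch_size, NUM_LATENTS)
        print(f"patch={patch_size:2d}  tokens={tokens:4d}  tokens per latent={ratio:.1f}")
```

Under these assumed numbers, halving the patch size quadruples the token count, so the tokens-per-latent ratio grows from 1 to 4 to 16 while the latent budget stays at 64.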

Code & Models

Repositories

inclusionai/TC-AE
github

Models

🤗
inclusionAI/TC-AE
model· ♡ 13