What Matters for Diffusion-Friendly Latent Manifold? Prior-Aligned Autoencoders for Latent Diffusion

Zhengrong Yue; Taihang Hu; Mengting Chen; Haiyu Zhang; Zihao Pan; Tao Liu; Zikang Wang; Jinsong Lan; Xiaoyong Zhu; Bo Zheng; Yali Wang

arXiv:2605.07915 · cs.CV · May 11, 2026

TL;DR

This paper investigates what makes a latent space suitable for diffusion models and proposes a new autoencoder that explicitly shapes the latent manifold to speed up diffusion training and improve generation quality.

Contribution

The paper introduces the Prior-Aligned AutoEncoder (PAE), a tokenizer that explicitly organizes the latent manifold using priors and regularization and that outperforms existing tokenizers on downstream generation.

Findings

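01 PAE achieves a state-of-the-art gFID of 1.03 on ImageNet 256x256.

02 PAE converges up to 13x faster than RAE under the same setup.

03 How the latent manifold is organized (coherent spatial structure, local manifold continuity, global manifold semantics) is more consistent with downstream generation quality than reconstruction fidelity is.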

Abstract

Tokenizers are a crucial component of latent diffusion models, as they define the latent space in which diffusion models operate. However, existing tokenizers are primarily designed to improve reconstruction fidelity or inherit pretrained representations, leaving unclear what kind of latent space is truly friendly for generative modeling. In this paper, we study this question from the perspective of latent manifold organization. By constructing controlled tokenizer variants, we identify three key properties of a diffusion-friendly latent manifold: coherent spatial structure, local manifold continuity, and global manifold semantics. We find that these properties are more consistent with downstream generation quality than reconstruction fidelity. Motivated by this finding, we propose the Prior-Aligned AutoEncoder (PAE), which explicitly shapes the latent manifold instead of leaving…

Code & Models

Repositories

zhengrongyue/PAE (GitHub)
