L2P: Unlocking Latent Potential for Pixel Generation

Zhennan Chen; Junwei Zhu; Xu Chen; Jiangning Zhang; Jiawei Chen; Zhuoqi Zeng; Wei Zhang; Chengjie Wang; Jian Yang; Ying Tai

arXiv:2605.12013·cs.CV·May 13, 2026

TL;DR

L2P is an efficient transfer framework that turns pre-trained latent diffusion models (LDMs) into pixel-space models, generating high-resolution images with a small training budget (8 GPUs and no real training data).

Contribution

The paper proposes L2P, a transfer paradigm that discards the VAE in favor of large-patch tokenization, freezes the source LDM's intermediate layers, and trains only the shallow layers on LDM-generated synthetic images, enabling rapid convergence to high-quality pixel generation.

Findings

01. L2P achieves performance comparable to its source LDMs on benchmark tasks.

02. L2P enables 4K ultra-high-resolution image generation.

03. Training overhead is negligible compared with training a pixel-space model from scratch.

Abstract

Pixel diffusion models have recently regained attention for visual generation. However, training advanced pixel-space models from scratch demands prohibitive computational and data resources. To address this, we propose the Latent-to-Pixel (L2P) transfer paradigm, an efficient framework that directly harnesses the rich knowledge of pre-trained LDMs to build powerful pixel-space models. Specifically, L2P discards the VAE in favor of large-patch tokenization and freezes the source LDM's intermediate layers, exclusively training shallow layers to learn the latent-to-pixel transformation. By utilizing LDM-generated synthetic images as the sole training corpus, L2P fits an already smooth data manifold, enabling rapid convergence with zero real-data collection. This strategy allows L2P to seamlessly migrate massive latent priors to the pixel space using only 8 GPUs. Furthermore, eliminating…
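The two structural ideas in the abstract, large-patch tokenization and training only the shallow layers, can be illustrated with a little arithmetic. The sketch below is not the authors' implementation. The patch sizes (16 and 32), the RGB channel count, the block count, and the reading of "shallow" as the blocks nearest the input and output are all assumptions chosen for illustration, and `num_tokens`, `token_dim`, and `trainable_blocks` are hypothetical helper names.

```python
def num_tokens(height, width, patch):
    """Count non-overlapping patch tokens for an image of the given size."""
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    return (height // patch) * (width // patch)


def token_dim(patch, channels=3):
    """Raw values per token when pixels are patchified directly (no VAE)."""
    return patch * patch * channels


def trainable_blocks(depth, shallow):
    """Indices of blocks left trainable: the first and last `shallow` blocks.

    Assumption: "shallow layers" means the blocks closest to the input and
    output, while every block in between stays frozen.
    """
    if shallow < 0 or 2 * shallow > depth:
        raise ValueError("shallow must satisfy 0 <= 2 * shallow <= depth")
    return [i for i in range(depth) if i < shallow or i >= depth - shallow]


# Sequence length at 4096 x 4096 for two assumed patch sizes.
print(num_tokens(4096, 4096, 16))  # 65536
print(num_tokens(4096, 4096, 32))  # 16384

# Per-token input width grows with the square of the patch size.
print(token_dim(16))  # 768
print(token_dim(32))  # 3072

# Hypothetical 8-block model with 2 trainable blocks at each end.
print(trainable_blocks(8, 2))  # [0, 1, 6, 7]
```

Doubling the patch size cuts the token count by four while quadrupling the raw width of each token, which is one reason the input and output projections would need retraining when the VAE is removed, whereas the frozen middle blocks can keep operating on sequences of a familiar length.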

Code & Models

Repositories

tencentyouturesearch/T2I-L2P
github

Datasets

zhen-nan/L2P-dataset
dataset · 14 downloads
