Improved Baselines with Representation Autoencoders

Jaskirat Singh; Boyang Zheng; Zongze Wu; Richard Zhang; Eli Shechtman; Saining Xie

arXiv:2605.18324·cs.CV·May 19, 2026

TL;DR

This paper introduces RAEv2, an improved version of Representation Autoencoders (RAE) whose revised design choices lead to faster training, better image generation quality, and broader applicability across tasks.

Contribution

The paper systematically investigates design choices in RAE, introduces RAEv2 with key improvements, and demonstrates significant speed and quality enhancements in generative modeling.

Findings

01 RAEv2 achieves over 10x faster convergence than original RAE.

02 RAEv2 attains a state-of-the-art gFID of 1.06 in 80 epochs on ImageNet-256.

03 RAEv2 outperforms previous methods on FDr^k with a score of 2.17 at 80 epochs.

Abstract

Representation Autoencoders (RAE) replace the traditional VAE with pretrained vision encoders. In this paper, we systematically investigate several design choices and find three insights that simplify and improve RAE. First, we study a generalized formulation where the representation is defined as the sum of the last k encoder layers rather than solely the final layer. This simple change greatly improves reconstruction without encoder finetuning or specialized data (e.g., text, faces). Second, we study the prevalent assumption that RAE (using a pretrained representation as the encoder) replaces representation alignment (REPA), which instead distills the same representation into intermediate layers. Through large-scale empirical analysis, we uncover a surprising finding: RAE and REPA exhibit complementary working mechanisms, allowing the same representation to be used as both encoder and target for…
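
The first insight in the abstract, defining the representation as the sum of the last k encoder layers, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `last_k_sum`, the toy random "layers", and the choice of a plain unweighted sum with no normalization are all assumptions made for illustration, since the abstract does not specify those details.

```python
import numpy as np


def last_k_sum(hidden_states, k):
    """Sum the outputs of the last k encoder layers (hypothetical helper).

    hidden_states: list of arrays, one per layer, each of shape
    (num_tokens, dim), ordered from the first layer to the last.
    k=1 recovers the usual final-layer-only representation.
    Assumption: an unweighted sum with no per-layer normalization.
    """
    if not 1 <= k <= len(hidden_states):
        raise ValueError("k must be between 1 and the number of layers")
    return np.sum(np.stack(hidden_states[-k:], axis=0), axis=0)


# Toy stand-in for a pretrained encoder: 4 layers, 3 tokens, 2 channels.
rng = np.random.default_rng(0)
layers = [rng.standard_normal((3, 2)) for _ in range(4)]

z_final = last_k_sum(layers, k=1)  # final layer only
z_last3 = last_k_sum(layers, k=3)  # sum of the last three layers
```

Because every layer output shares the same shape, the summed representation has the same dimensions as the final-layer one, so in this sketch it could stand in for the final-layer representation without changing downstream shapes.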

Code & Models

Repositories

https://raev2.github.io
