Sparse Autoencoders, Again?

Yin Lu; Xuening Zhu; Tong He; David Wipf

arXiv:2506.04859 · cs.LG · June 9, 2025

Open Access

TL;DR

This paper identifies limitations of traditional sparse autoencoders (SAEs) and variational autoencoders (VAEs) and proposes a hybrid model that better captures data structure, yields sparser representations, and outperforms existing models in various domains.

Contribution

The authors introduce a hybrid autoencoder model that overcomes weaknesses of canonical SAEs and VAEs, with theoretical guarantees and improved empirical performance.

Findings

01. The hybrid model recovers structured data across multiple manifolds.

02. It produces sparser latent representations without losing reconstruction quality.

03. It outperforms traditional SAEs, VAEs, and recent diffusion models in experiments.

Abstract

Is there really much more to say about sparse autoencoders (SAEs)? Autoencoders in general, and SAEs in particular, represent deep architectures that are capable of modeling low-dimensional latent structure in data. Such structure could reflect, among other things, correlation patterns in large language model activations, or complex natural image manifolds. And yet despite the wide-ranging applicability, there have been relatively few changes to SAEs beyond the original recipe from decades ago, namely, standard deep encoder/decoder layers trained with a classical/deterministic sparse regularizer applied within the latent space. One possible exception is the variational autoencoder (VAE), which adopts a stochastic encoder module capable of producing sparse representations when applied to manifold data. In this work we formalize underappreciated weaknesses with both canonical SAEs, as…
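As a rough illustration of the "original recipe" the abstract refers to (a deterministic encoder and decoder trained with a sparse regularizer on the latent code), the following is a minimal sketch, not the authors' implementation and not the hybrid model they propose. It assumes a single linear layer with ReLU for the encoder, a linear decoder, and an L1 penalty as the sparse regularizer; the names `sae_loss`, `W_enc`, `W_dec`, and `lam` are hypothetical and chosen only for this example.

```python
def relu(v):
    """Elementwise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def sae_loss(x, W_enc, W_dec, lam):
    """Squared reconstruction error plus an L1 penalty on the latent code.

    Assumption: L1 is used here as one common deterministic sparse
    regularizer; the source does not specify which penalty is meant.
    """
    z = relu(matvec(W_enc, x))            # deterministic encoder
    x_hat = matvec(W_dec, z)              # linear decoder
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    sparsity = sum(abs(v) for v in z)     # L1 norm of the latent code
    return recon + lam * sparsity, z


# Toy example: 2-dimensional input, 3-dimensional latent code.
x = [1.0, -2.0]
W_enc = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_dec = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

loss, z = sae_loss(x, W_enc, W_dec, lam=0.5)
print(loss)  # 4.5
print(z)     # [1.0, 0.0, 0.0]
```

In this toy case only one latent unit is active, so the penalty term contributes 0.5 and the remaining 4.0 comes from the second input coordinate, which the ReLU code cannot reconstruct. Increasing `lam` trades reconstruction accuracy for fewer or smaller active latent units.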

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Domain Adaptation and Few-Shot Learning

Methods: Diffusion