Disentangled representations via score-based variational autoencoders

Benjamin S. H. Lyo; Eero P. Simoncelli; Cristina Savin

arXiv:2512.17127 · stat.ML · December 23, 2025

Open Access

TL;DR

This paper introduces SAMI, a novel unsupervised learning method that combines diffusion models and VAEs to learn interpretable, structured representations from complex data, including images and videos.

Contribution

SAMI unifies diffusion models and VAEs into a single framework, enabling the extraction of meaningful, disentangled representations without supervision.

Findings

01. Recovers ground truth generative factors in synthetic data

02. Learns factorized, semantic latent dimensions from natural images

03. Encodes video sequences into straighter latent trajectories
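
To make "straighter latent trajectories" concrete, here is a minimal sketch of one common way to quantify straightness: the mean angle between successive displacement vectors of a trajectory, where 0 degrees is a perfectly straight path. This listing does not say which metric the paper uses, so the function name `mean_curvature_deg` and the choice of metric are assumptions made for illustration only.

```python
import math


def mean_curvature_deg(trajectory):
    """Mean angle in degrees between successive displacement vectors.

    Hypothetical helper for illustration; not taken from the paper.
    `trajectory` is a sequence of equal-length points, one per video frame.
    """
    points = [tuple(float(x) for x in p) for p in trajectory]
    if len(points) < 3:
        raise ValueError("need at least three points to measure curvature")

    # Displacement between each pair of consecutive latent points.
    steps = [[b - a for a, b in zip(p, q)] for p, q in zip(points, points[1:])]

    angles = []
    for u, v in zip(steps, steps[1:]):
        norm_u = math.sqrt(sum(x * x for x in u))
        norm_v = math.sqrt(sum(x * x for x in v))
        if norm_u == 0.0 or norm_v == 0.0:
            raise ValueError("consecutive points must be distinct")
        cosine = sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)
        # Clamp to guard against floating-point drift outside [-1, 1].
        cosine = max(-1.0, min(1.0, cosine))
        angles.append(math.degrees(math.acos(cosine)))

    return sum(angles) / len(angles)


straight = [(t, 2 * t) for t in range(5)]      # points on a line
corner = [(0, 0), (1, 0), (1, 1), (0, 1)]      # right-angle turns
print(mean_curvature_deg(straight))
print(mean_curvature_deg(corner))
```

Under this assumed metric, a lower value means a straighter trajectory, so encoders could be compared by averaging the score over the latent trajectories of many video clips.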

Abstract

We present the Score-based Autoencoder for Multiscale Inference (SAMI), a method for unsupervised representation learning that combines the theoretical frameworks of diffusion models and VAEs. By unifying their respective evidence lower bounds, SAMI formulates a principled objective that learns representations through score-based guidance of the underlying diffusion process. The resulting representations automatically capture meaningful structure in the data: SAMI recovers ground-truth generative factors in our synthetic dataset, learns factorized, semantic latent dimensions from complex natural images, and encodes video sequences into latent trajectories that are straighter than those of alternative encoders, despite training exclusively on static images. Furthermore, SAMI can extract useful representations from pre-trained diffusion models with minimal additional training. Finally, the…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks · Advanced Neuroimaging Techniques and Applications