Disentanglement with Factor Quantized Variational Autoencoders

Gulcin Baykal; Melih Kandemir; Gozde Unal

arXiv:2409.14851 · cs.CV · November 6, 2025

TL;DR

This paper introduces FactorQVAE, a discrete variational autoencoder that learns disentangled representations without ground truth factors, using scalar quantization and total correlation to improve disentanglement and reconstruction.

Contribution

The work proposes a novel discrete VAE model with scalar quantization and an inductive bias to enhance disentanglement without requiring known generative factors.

Findings

01. Outperforms existing disentanglement methods on DCI and InfoMEC metrics

02. Improves reconstruction quality compared to prior approaches

03. Demonstrates the effectiveness of discrete representations in disentanglement

Abstract

Disentangled representation learning aims to represent the underlying generative factors of a dataset independently of one another in a latent representation. In our work, we propose a discrete variational autoencoder (VAE) based model in which ground truth information about the generative factors is not provided to the model. We demonstrate the advantages of learning discrete representations over continuous representations in facilitating disentanglement. Furthermore, we propose incorporating an inductive bias into the model to further enhance disentanglement. Specifically, we quantize each latent variable in a latent representation to a scalar value from a global codebook, and we add a total correlation term to the optimization as an inductive bias. Our method, called FactorQVAE, combines optimization-based disentanglement approaches with discrete…
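To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a tiny hand-written codebook and a hand-written batch of encoder outputs (both hypothetical), snaps each latent scalar to its nearest entry in the shared codebook, and then computes a plug-in empirical total correlation, TC = Σ_j H(z_j) − H(z), over the resulting discrete codes. The abstract does not say how the total correlation term is estimated during training, so the estimator here is purely illustrative.

```python
import math
from collections import Counter


def quantize(latents, codebook):
    """Map each scalar latent to the index of the nearest scalar in a shared codebook."""
    indices = [
        min(range(len(codebook)), key=lambda k: abs(z - codebook[k]))
        for z in latents
    ]
    return indices, [codebook[k] for k in indices]


def entropy(samples):
    """Empirical Shannon entropy (in nats) of a list of hashable samples."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def total_correlation(codes):
    """Plug-in estimate of TC = sum_j H(z_j) - H(z) for a batch of discrete code vectors."""
    dims = len(codes[0])
    marginals = sum(entropy([row[j] for row in codes]) for j in range(dims))
    joint = entropy([tuple(row) for row in codes])
    return marginals - joint


# Hypothetical values for illustration only (not taken from the paper).
codebook = [-1.0, 0.0, 1.0]  # one global codebook of scalars, shared by all latent dimensions
batch = [[-0.9, 0.1], [0.8, 0.95], [-1.2, -0.7], [0.05, 0.2]]  # stand-in encoder outputs

codes = [quantize(z, codebook)[0] for z in batch]
tc = total_correlation(codes)
print(codes, tc)
```

A total correlation of zero means the code dimensions are statistically independent in the batch; larger values indicate that the dimensions share information, which is what a total correlation penalty discourages.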

Code & Models

Repositories

ituvisionlab/factorqvae (PyTorch, official)

Taxonomy

Topics: Image and Signal Denoising Methods