On the failure of variational score matching for VAE models
Li Kevin Wenliang

TL;DR
This paper critically evaluates variational score matching objectives for VAEs. Through theoretical analysis and experiments, it shows that existing objectives can fail catastrophically and that only ELBO-based methods reliably produce good models.
Contribution
The paper provides a theoretical analysis of variational score matching objectives for VAEs, identifies how they fail, and argues that ELBO-based methods are more reliable.
Findings
Existing variational score matching objectives fail catastrophically across a wide range of datasets and network architectures.
Modified score matching objectives resemble the ELBO and perform reliably.
ELBO and baseline objectives produce consistent, expected results in experiments.
Abstract
Score matching (SM) is a convenient method for training flexible probabilistic models, which is often preferred over the traditional maximum-likelihood (ML) approach. However, these models are less interpretable than normalized models; as such, training robustness is in general difficult to assess. We present a critical study of existing variational SM objectives, showing catastrophic failure on a wide range of datasets and network architectures. Our theoretical insights on the objectives emerge directly from their equivalent autoencoding losses when optimizing variational autoencoder (VAE) models. First, we show that in the Fisher autoencoder, SM produces far worse models than maximum-likelihood, and approximate inference by Fisher divergence can lead to low-density local optima. However, with important modifications, this objective reduces to a regularized autoencoding loss that…
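For readers unfamiliar with the SM versus ML contrast drawn in the abstract, the following is a minimal sketch of the classical (Hyvärinen) score matching objective on a one-dimensional Gaussian model. It is not the variational objective studied in the paper, and the function names `sm_objective` and `sm_fit` are hypothetical, chosen for illustration only. For this simple normalized model, the SM minimizer coincides with the maximum-likelihood estimate; the paper's concern is what happens when variational versions of SM are applied to VAEs.

```python
import numpy as np

def sm_objective(x, mu, sigma2):
    """Empirical score matching objective for a 1D Gaussian N(mu, sigma2).

    Uses the standard form E[ 0.5 * s(x)^2 + s'(x) ], where
    s(x) = d/dx log q(x) is the model score. Illustrative only.
    """
    score = -(x - mu) / sigma2      # d/dx log q(x)
    score_grad = -1.0 / sigma2      # d^2/dx^2 log q(x), constant in x
    return np.mean(0.5 * score**2 + score_grad)

def sm_fit(x):
    """Closed-form minimizer of sm_objective over (mu, sigma2).

    Setting the gradients to zero gives the sample mean and the
    (biased) sample variance, the same as the ML estimate.
    """
    mu = x.mean()
    sigma2 = np.mean((x - mu) ** 2)
    return mu, sigma2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(loc=2.0, scale=1.5, size=5000)
    mu_hat, sigma2_hat = sm_fit(data)
    print("SM estimate:", mu_hat, sigma2_hat)
    print("objective at optimum:", sm_objective(data, mu_hat, sigma2_hat))
```

Note that the objective only needs derivatives of the log-density with respect to the data, so the normalizing constant never appears. This is what makes SM convenient for unnormalized models.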
Taxonomy
Topics: Machine Learning and Data Classification · Explainable Artificial Intelligence (XAI) · Domain Adaptation and Few-Shot Learning
