Direct Optimization through $\arg \max$ for Discrete Variational Auto-Encoder

Guy Lorberbom (Technion); Andreea Gane (MIT); Tommi Jaakkola (MIT); Tamir Hazan (Technion)

arXiv:1806.02867·cs.LG·December 10, 2019·6 cites

TL;DR

This paper introduces a direct optimization method for discrete variational auto-encoders that bypasses softmax relaxations, enabling effective training of models with structured discrete latent variables.

Contribution

It applies direct loss minimization to the non-differentiable $\arg \max$ objective that arises from the Gumbel-Max reparameterization of discrete VAEs, and extends the approach to structured latent variables for which the $\arg \max$ is tractable.

Findings

01. Effective training of discrete VAEs demonstrated

02. Outperforms softmax relaxation methods

03. Applicable to structured discrete latent models

Abstract

Reparameterization of variational auto-encoders with continuous random variables is an effective method for reducing the variance of their gradient estimates. In the discrete case, one can perform reparameterization using the Gumbel-Max trick, but the resulting objective relies on an $\arg \max$ operation and is non-differentiable. In contrast to previous works, which resort to softmax-based relaxations, we propose to optimize it directly by applying the direct loss minimization approach. Our proposal extends naturally to structured discrete latent variable models when evaluating the $\arg \max$ operation is tractable. We demonstrate empirically the effectiveness of the direct loss minimization technique in variational auto-encoders with both unstructured and structured discrete latent variables.
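
For readers unfamiliar with the Gumbel-Max trick mentioned in the abstract, the following is a minimal sketch, not the authors' code. It assumes NumPy and uses hypothetical names (`gumbel_max_sample`, `softmax`, and the example `logits`). It illustrates only the sampling identity: adding independent Gumbel(0, 1) noise to each logit and taking the $\arg \max$ yields a sample from the categorical distribution given by the softmax of the logits. It does not implement the direct loss minimization gradient estimator itself.

```python
import numpy as np


def gumbel_max_sample(logits, rng):
    """Draw one categorical sample as the argmax of Gumbel-perturbed logits."""
    # One independent Gumbel(0, 1) draw per category.
    gumbel_noise = rng.gumbel(loc=0.0, scale=1.0, size=logits.shape)
    return int(np.argmax(logits + gumbel_noise))


def softmax(logits):
    """Numerically stable softmax, used here only as the reference distribution."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()


# Hypothetical example: three categories with arbitrary logits.
rng = np.random.default_rng(0)
logits = np.array([1.0, 0.0, -1.0])
num_samples = 20000

counts = np.zeros(len(logits))
for _ in range(num_samples):
    counts[gumbel_max_sample(logits, rng)] += 1

empirical = counts / num_samples  # sample frequencies from the argmax
target = softmax(logits)          # probabilities the frequencies should approach
```

The sample itself is a hard, discrete index, so the map from `logits` to the sample is piecewise constant. This is the non-differentiability that softmax relaxations smooth over and that the paper proposes to handle directly.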

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks · Domain Adaptation and Few-Shot Learning