Grammar Variational Autoencoder
Matt J. Kusner, Brooks Paige, José Miguel Hernández-Lobato

TL;DR
This paper introduces a grammar-based variational autoencoder that ensures valid discrete data generation, such as expressions and molecules, by encoding and decoding parse trees, leading to more coherent latent spaces and improved optimization performance.
Contribution
It presents a novel grammar variational autoencoder that guarantees validity of generated discrete data and enhances latent space coherence compared to existing methods.
Findings
Higher validity in generated outputs
More coherent latent space representations
Improved performance in Bayesian optimization tasks
Abstract
Deep generative models have been wildly successful at learning coherent latent representations for continuous data such as video and audio. However, generative modeling of discrete data such as arithmetic expressions and molecular structures still poses significant challenges. Crucially, state-of-the-art methods often produce outputs that are not valid. We make the key observation that frequently, discrete data can be represented as a parse tree from a context-free grammar. We propose a variational autoencoder which encodes and decodes directly to and from these parse trees, ensuring the generated outputs are always valid. Surprisingly, we show that not only does our model more often generate valid outputs, it also learns a more coherent latent space in which nearby points decode to similar discrete outputs. We demonstrate the effectiveness of our learned models by showing their…
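As a rough illustration of how decoding through a context-free grammar can rule out invalid outputs, the sketch below expands the leftmost nonterminal at each step and only considers production rules whose left-hand side matches it. The toy arithmetic grammar, the names `RULES` and `masked_decode`, and the use of plain per-step score lists in place of a neural decoder's outputs are all assumptions made for this example, not details taken from the paper.

```python
# Toy context-free grammar for arithmetic expressions (an assumption for this
# sketch). Each rule is (left-hand side, right-hand side symbols).
RULES = [
    ("S", ["S", "+", "T"]),  # 0
    ("S", ["T"]),            # 1
    ("T", ["T", "*", "F"]),  # 2
    ("T", ["F"]),            # 3
    ("F", ["(", "S", ")"]),  # 4
    ("F", ["x"]),            # 5
    ("F", ["1"]),            # 6
]
NONTERMINALS = {"S", "T", "F"}


def masked_decode(scores):
    """Turn per-step rule scores into a derivation, masking invalid rules.

    `scores` is a list of rows, one row per decoding step, with one score per
    rule in RULES (a stand-in for decoder logits). Returns the list of applied
    rule indices and the decoded string, or None in place of the string if
    nonterminals remain when the rows run out.
    """
    form = ["S"]  # current sentential form, starting from the start symbol
    applied = []
    for row in scores:
        # Locate the leftmost nonterminal; stop if the string is complete.
        idx = next((i for i, s in enumerate(form) if s in NONTERMINALS), None)
        if idx is None:
            break
        lhs = form[idx]
        # Mask: only rules that expand this nonterminal are eligible.
        valid = [k for k, (left, _) in enumerate(RULES) if left == lhs]
        choice = max(valid, key=lambda k: row[k])
        applied.append(choice)
        # Replace the nonterminal with the right-hand side of the chosen rule.
        form[idx:idx + 1] = RULES[choice][1]
    complete = all(s not in NONTERMINALS for s in form)
    text = "".join(form) if complete else None
    return applied, text
```

Because the mask is applied before the rule is chosen, a high score on a rule that does not fit the current nonterminal is simply ignored, so every completed derivation is a string of the grammar. In this sketch a derivation that has not finished when the score rows run out is reported as `None` rather than returned as a partial string.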
Videos
Grammar Variational Autoencoder (YouTube)
Taxonomy
Topics: Topic Modeling · Generative Adversarial Networks and Image Synthesis · Machine Learning in Materials Science
