Increasing the Generalisation Capacity of Conditional VAEs

Alexej Klushyn; Nutan Chen; Botond Cseke; Justin Bayer; Patrick van der Smagt

arXiv:1908.08750·stat.ML·September 11, 2019

Open Access

TL;DR

This paper improves the generalisation ability of conditional variational autoencoders by defining the likelihood as a function of the latent variable only and by introducing an expressive multimodal prior, targeting structured-prediction tasks in which a single input has many valid solutions.

Contribution

It introduces a new approach to incentivise informative latent representations and employs a multimodal prior to improve generalisation in conditional VAEs.

Findings

01. Higher generalisation capability demonstrated on multiple datasets

02. Significant improvement over baseline models

03. Effective capturing of semantically meaningful features

Abstract

We address the problem of one-to-many mappings in supervised learning, where a single instance has many different solutions of possibly equal cost. The framework of conditional variational autoencoders describes a class of methods to tackle such structured-prediction tasks by means of latent variables. We propose to incentivise informative latent representations for increasing the generalisation capacity of conditional variational autoencoders. To this end, we modify the latent variable model by defining the likelihood as a function of the latent variable only, and we introduce an expressive multimodal prior to enable the model to capture semantically meaningful features of the data. To validate our approach, we train our model on the Cornell Robot Grasping dataset and on modified versions of MNIST and Fashion-MNIST, obtaining results that show a significantly higher generalisation…
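
The modification described in the abstract can be sketched by comparing evidence lower bounds. The notation below (input x, output y, latent z, parameters θ and φ) is an assumption chosen for illustration and is not taken from the paper. The first line is the usual conditional VAE bound. The second replaces the decoder p(y | x, z) with one that depends on z only, as the abstract describes. The third line is just one possible multimodal prior, a conditional Gaussian mixture with hypothetical components π_k, μ_k, Σ_k; the prior and the exact training objective used in the paper may differ.

```latex
\begin{align}
\text{standard CVAE:}\quad
  \log p_\theta(y \mid x) &\ge
  \mathbb{E}_{q_\phi(z \mid x, y)}\!\left[\log p_\theta(y \mid x, z)\right]
  - \mathrm{KL}\!\left(q_\phi(z \mid x, y)\,\|\,p_\theta(z \mid x)\right) \\
\text{likelihood on } z \text{ only:}\quad
  \log p_\theta(y \mid x) &\ge
  \mathbb{E}_{q_\phi(z \mid x, y)}\!\left[\log p_\theta(y \mid z)\right]
  - \mathrm{KL}\!\left(q_\phi(z \mid x, y)\,\|\,p_\theta(z \mid x)\right) \\
\text{illustrative multimodal prior:}\quad
  p_\theta(z \mid x) &= \sum_{k=1}^{K} \pi_k(x)\,
  \mathcal{N}\!\left(z;\, \mu_k(x),\, \Sigma_k(x)\right)
\end{align}
```

In the second bound the decoder cannot read x directly, so all information about y that depends on x has to pass through z, which is one way to read the abstract's aim of incentivising informative latent representations.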

Taxonomy

Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning