Regularising Deep Networks with Deep Generative Models
Matthew Willetts, Alexander Camuto, Stephen Roberts, Chris Holmes

TL;DR
This paper introduces a novel regularisation technique for neural networks that models activation distributions and imputes values during training, improving accuracy and uncertainty calibration on image classification tasks.
Contribution
The method generalises data augmentation to the hidden layers of a network: a deep generative model over the activations supplies imputed values during training, which acts as a regulariser and improves calibration.
Findings
Higher test accuracy on CIFAR-10 and SVHN
Lower test-set cross-entropy compared to baselines
Better calibrated uncertainty over class posteriors
Abstract
We develop a new method for regularising neural networks. We learn a probability distribution over the activations of all layers of the model and then insert imputed values into the network during training. We obtain a posterior for an arbitrary subset of activations conditioned on the remainder. This is a generalisation of data augmentation to the hidden layers of a network, and a form of data-aware dropout. We demonstrate that our training method leads to higher test accuracy and lower test-set cross-entropy for neural networks trained on CIFAR-10 and SVHN compared to standard regularisation baselines: our approach leads to networks with better calibrated uncertainty over the class posteriors all the while delivering greater test-set accuracy.
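The imputation step described in the abstract (a posterior for an arbitrary subset of activations, conditioned on the remainder) can be illustrated with a deliberately simplified stand-in. The sketch below is not the authors' model: it assumes a single joint Gaussian over the activations in place of a deep generative model, because a Gaussian has a closed-form conditional. The function names `fit_gaussian` and `impute_subset`, and the `jitter` parameter, are hypothetical and chosen only for this illustration.

```python
import numpy as np


def fit_gaussian(acts, jitter=1e-3):
    """Fit a joint Gaussian to activations of shape (n_samples, n_units).

    ASSUMPTION: a Gaussian replaces the paper's deep generative model.
    `jitter` is added to the diagonal so the covariance stays invertible.
    """
    mean = acts.mean(axis=0)
    cov = np.cov(acts, rowvar=False) + jitter * np.eye(acts.shape[1])
    return mean, cov


def impute_subset(x, mask, mean, cov, rng):
    """Resample the units where `mask` is True, conditioned on the rest of `x`."""
    m, o = mask, ~mask
    if not m.any():
        return x.copy()  # nothing to impute
    if not o.any():
        # Nothing to condition on: sample every unit from the marginal.
        return rng.multivariate_normal(mean, cov)

    cov_oo = cov[np.ix_(o, o)]
    cov_mo = cov[np.ix_(m, o)]
    cov_mm = cov[np.ix_(m, m)]

    # gain = cov_mo @ inv(cov_oo), computed with a linear solve.
    gain = np.linalg.solve(cov_oo, cov_mo.T).T
    cond_mean = mean[m] + gain @ (x[o] - mean[o])
    cond_cov = cov_mm - gain @ cov_mo.T
    cond_cov = 0.5 * (cond_cov + cond_cov.T)  # symmetrise against round-off

    out = x.copy()
    out[m] = rng.multivariate_normal(cond_mean, cond_cov)
    return out
```

In a training loop, one would record a layer's activations, draw a random mask, and pass the output of `impute_subset` forward in place of the original activations. Unlike standard dropout, the masked units are replaced by values that are plausible given the units left untouched, which is the sense in which the abstract calls the method data-aware dropout.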
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Gaussian Processes and Bayesian Inference · Domain Adaptation and Few-Shot Learning
