Masking schemes for universal marginalisers
Divya Gautam, Maria Lomeli, Kostis Gourgoulias, Daniel H. Thompson, Saurabh Johri

TL;DR
This paper investigates how different masking schemes affect the training and generalization of universal marginalisers, which learn conditional distributions, by comparing structure-agnostic and structure-dependent approaches in a self-supervised setting.
Contribution
The paper compares masking schemes for training universal marginalisers and analyzes their impact on predictive accuracy and generalization.
Findings
Structure-dependent masking improves generalization over structure-agnostic schemes.
Training with different masking schemes affects the neural network's ability to learn accurate conditional distributions.
The study provides insights into optimal masking strategies for self-supervised learning of probabilistic models.
Abstract
We consider the effect of structure-agnostic and structure-dependent masking schemes when training a universal marginaliser (arXiv:1711.00695) in order to learn conditional distributions of the form $P(x_i \mid \mathbf{x}_b)$, where $x_i$ is a given random variable and $\mathbf{x}_b$ is some arbitrary subset of all random variables of the generative model of interest. In other words, we mimic the self-supervised training of a denoising autoencoder, where a dataset of unlabelled data is used as partially observed input and the neural approximator is optimised to minimise reconstruction loss. We focus on studying the underlying process of the partially observed data: how good is the neural approximator at learning all conditional distributions when the observation process at prediction time differs from the masking process during training? We compare networks trained with…
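To make the two families of masking schemes concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the `UNOBSERVED` sentinel, the four-node graph, and in particular the rule used for the structure-dependent mask (always hide root nodes, hide the rest at random) are assumptions chosen only for illustration.

```python
import random

# Hypothetical sentinel marking a variable as masked (unobserved).
UNOBSERVED = None


def agnostic_mask(n_vars, p_hide, rng):
    """Structure-agnostic: hide each variable independently with probability p_hide."""
    return [rng.random() < p_hide for _ in range(n_vars)]


def dependent_mask(parents, p_hide, rng):
    """Structure-dependent (illustrative rule, an assumption of this sketch):
    always hide nodes with no parents, hide every other node with probability p_hide."""
    return [len(pa) == 0 or rng.random() < p_hide for pa in parents]


def apply_mask(sample, mask):
    """Return the partially observed input that the network would receive."""
    return [UNOBSERVED if hidden else value for value, hidden in zip(sample, mask)]


rng = random.Random(0)
parents = [[], [], [0], [0, 1]]  # hypothetical 4-node DAG: parents[i] lists the parents of node i
sample = [1, 0, 1, 1]            # one fully observed training sample

x_agnostic = apply_mask(sample, agnostic_mask(len(sample), 0.5, rng))
x_dependent = apply_mask(sample, dependent_mask(parents, 0.5, rng))
```

In a training loop of the kind the abstract describes, the masked vector would be the network input and the original `sample` the reconstruction target. Using one mask function for training and the other for evaluation mirrors the mismatch between training-time masking and prediction-time observation that the abstract asks about.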
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks · Gaussian Processes and Bayesian Inference
