Masking schemes for universal marginalisers

Divya Gautam; Maria Lomeli; Kostis Gourgoulias; Daniel H. Thompson; Saurabh Johri

arXiv:2001.05895·cs.LG·January 17, 2020·1 cites

Open Access

TL;DR

This paper investigates how different masking schemes affect the training and generalization of universal marginalisers, which learn conditional distributions, by comparing structure-agnostic and structure-dependent approaches in a self-supervised setting.

Contribution

It compares masking schemes for training universal marginalisers and analyzes their impact on predictive accuracy and generalization.

Findings

01

Structure-dependent masking improves generalization over structure-agnostic schemes.

02

Training with different masking schemes affects the neural network's ability to learn accurate conditional distributions.

03

The study provides insights into optimal masking strategies for self-supervised learning of probabilistic models.

Abstract

We consider the effect of structure-agnostic and structure-dependent masking schemes when training a universal marginaliser (arXiv:1711.00695) in order to learn conditional distributions of the form $P(x_i \mid x_b)$, where $x_i$ is a given random variable and $x_b$ is some arbitrary subset of all random variables of the generative model of interest. In other words, we mimic the self-supervised training of a denoising autoencoder, where a dataset of unlabelled data is used as partially observed input and the neural approximator is optimised to minimise reconstruction loss. We focus on studying the underlying process of the partially observed data: how good is the neural approximator at learning all conditional distributions when the observation process at prediction time differs from the masking process during training? We compare networks trained with…
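The training setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes binary variables, a structure-agnostic scheme that hides each variable independently with a fixed probability, a two-slot (value, observed-flag) input encoding, and binary cross-entropy as the reconstruction loss. The names `bernoulli_mask`, `encode`, `reconstruction_loss`, and `MASKED` are hypothetical, and a constant guess of 0.5 stands in for the network's output.

```python
import math
import random

MASKED = None  # hypothetical placeholder marking an unobserved variable


def bernoulli_mask(sample, mask_prob, rng):
    """Structure-agnostic masking (assumed form): hide each variable
    independently with probability mask_prob."""
    return [MASKED if rng.random() < mask_prob else v for v in sample]


def encode(masked_sample):
    """Encode each binary variable as (value, observed_flag).
    Masked entries become (0.0, 0.0). This encoding is an assumption."""
    encoded = []
    for v in masked_sample:
        if v is MASKED:
            encoded.extend([0.0, 0.0])
        else:
            encoded.extend([float(v), 1.0])
    return encoded


def reconstruction_loss(predicted_probs, sample):
    """Mean binary cross-entropy between the predicted P(x_i = 1 | x_b)
    and the full, unmasked sample."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for p, x in zip(predicted_probs, sample):
        total -= x * math.log(p + eps) + (1 - x) * math.log(1 - p + eps)
    return total / len(sample)


rng = random.Random(0)
sample = [1, 0, 1, 1, 0]                  # one fully observed training row
masked = bernoulli_mask(sample, 0.5, rng)  # partially observed input x_b
net_input = encode(masked)                 # what a network would receive
uniform_guess = [0.5] * len(sample)        # stand-in for the network output
loss = reconstruction_loss(uniform_guess, sample)
```

A structure-dependent scheme would replace `bernoulli_mask` with a function that chooses which variables to hide based on the generative model's structure; the visible portion of the abstract does not specify that procedure, so it is not sketched here.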

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks · Gaussian Processes and Bayesian Inference