On Transformations in Stochastic Gradient MCMC

Soma Yokoi; Takuma Otsuka; Issei Sato

arXiv:1903.02750 · stat.ML · June 21, 2019 · 1 citation

TL;DR

This paper investigates the impact of variable transformations in stochastic gradient MCMC methods, revealing that proper invertible Lipschitz mappings improve sampling accuracy for models with bounded variables.

Contribution

It identifies issues with common mapping approaches in SGLD and proposes an invertible Lipschitz mapping to ensure correct sampling and convergence.

Findings

01. Erroneous samples from common mappings are demonstrated.

02. Invertible Lipschitz mappings improve sampling accuracy.

03. Effective in models like Bayesian non-negative matrix factorization.

Abstract

Stochastic gradient Langevin dynamics (SGLD) is a computationally efficient sampler for Bayesian posterior inference given a large-scale dataset. Although SGLD is designed for unbounded random variables, many practical models incorporate bounded variables, such as non-negative ones or those in a finite interval. To bridge this gap, we consider mapping unbounded samples into the target interval. This paper shows, both theoretically and empirically, that several mapping approaches commonly used in the literature produce erroneous samples. We show that a change of random variable using an invertible Lipschitz mapping function overcomes this pitfall and attains weak convergence. Experiments demonstrate its efficacy for widely used models with bounded latent variables, including Bayesian non-negative matrix factorization and binary neural networks.
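
The change-of-variable idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a toy Poisson model with a Gamma(a, b) prior on a positive rate theta, and it assumes softplus as the invertible Lipschitz mapping from an unbounded variable phi to theta. The abstract does not name a specific mapping, so that choice and every name below (sgld_softplus, step, batch) are hypothetical. SGLD runs on phi, and the log-density in phi includes the log-Jacobian term log |f'(phi)|, whose gradient for softplus is 1 - sigmoid(phi).

```python
import numpy as np

def softplus(phi):
    # Numerically stable log(1 + exp(phi)); maps the real line onto (0, inf).
    return np.logaddexp(0.0, phi)

def sigmoid(phi):
    # Derivative of softplus; it lies in (0, 1), so softplus is 1-Lipschitz.
    return 1.0 / (1.0 + np.exp(-phi))

def sgld_softplus(x, a=2.0, b=1.0, step=1e-3, n_steps=20000, batch=10, seed=0):
    """SGLD on the unbounded variable phi, with theta = softplus(phi) > 0.

    Assumed toy model (illustrative only): x_i ~ Poisson(theta),
    theta ~ Gamma(shape=a, rate=b).
    """
    rng = np.random.default_rng(seed)
    N = len(x)
    phi = 0.0
    samples = np.empty(n_steps)
    for t in range(n_steps):
        theta = softplus(phi)
        idx = rng.integers(0, N, size=batch)  # minibatch, drawn with replacement
        # Unbiased minibatch estimate of d/dtheta log p(theta | x).
        grad_theta = (a - 1.0) / theta - b \
            + (N / batch) * np.sum(x[idx] / theta - 1.0)
        s = sigmoid(phi)
        # Chain rule plus the gradient of the log-Jacobian, log sigmoid(phi).
        grad_phi = grad_theta * s + (1.0 - s)
        # Langevin step in phi-space with injected Gaussian noise.
        phi = phi + 0.5 * step * grad_phi + np.sqrt(step) * rng.standard_normal()
        samples[t] = softplus(phi)  # map back to the bounded variable
    return samples

# Synthetic data for the toy model.
data_rng = np.random.default_rng(1)
x = data_rng.poisson(3.0, size=100).astype(float)

samples = sgld_softplus(x)[2000:]  # discard burn-in
posterior_mean = (2.0 + x.sum()) / (1.0 + len(x))  # conjugate Gamma posterior
```

Because the Gamma prior is conjugate in this toy model, the sample mean of theta can be compared with the analytic posterior mean (a + sum(x)) / (b + N) as a sanity check. Every sample is positive by construction, since softplus never leaves (0, inf).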

Taxonomy

Topics: Markov Chains and Monte Carlo Methods · Advanced Memory and Neural Computing · Stochastic Gradient Optimization Techniques