Stochastic gradient method with accelerated stochastic dynamics

Masayuki Ohzeki

arXiv:1511.06036 · stat.ML · May 4, 2016

TL;DR

This paper introduces an accelerated stochastic gradient method that violates detailed balance to improve mixing rates and convergence, enhancing Bayesian sampling efficiency in large-scale learning.

Contribution

It proposes a novel stochastic gradient technique that accelerates convergence by violating detailed balance, improving sampling performance in Bayesian methods.

Findings

01 Enhanced mixing rate observed in experiments

02 Reduced correlation time between samples

03 Improved convergence speed in simple models

Abstract

In this paper, we propose a novel technique to implement stochastic gradient methods, which are beneficial for learning from large datasets, through accelerated stochastic dynamics. A stochastic gradient method is based on mini-batch learning for reducing the computational cost when the amount of data is large. The stochasticity of the gradient can be mitigated by the injection of Gaussian noise, which yields the stochastic Langevin gradient method; this method can be used for Bayesian posterior sampling. However, the performance of the stochastic Langevin gradient method depends on the mixing rate of the stochastic dynamics. In this study, we propose violating the detailed balance condition to enhance the mixing rate. Recent studies have revealed that violating the detailed balance condition accelerates the convergence to a stationary state and reduces the correlation time between the…
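The idea described in the abstract (a mini-batch gradient, injected Gaussian noise, and a drift that breaks detailed balance) can be illustrated with a minimal sketch. The construction below is an assumption made for illustration and is not necessarily the one used in the paper: it adds a skew-symmetric term `gamma * J` to the drift, a standard way of violating detailed balance in Langevin dynamics that leaves the continuous-time stationary distribution unchanged. The function name `sgld_skew`, the toy model (a 2D Gaussian mean with unit likelihood variance and a Gaussian prior), and all parameter values are hypothetical.

```python
import numpy as np


def sgld_skew(data, n_steps=5000, batch_size=50, step=1e-4, gamma=1.0,
              prior_var=10.0, seed=0):
    """Stochastic gradient Langevin sampler with a skew-symmetric drift term.

    Illustrative sketch only. Target: posterior over the mean `theta` of a
    unit-variance Gaussian likelihood with a zero-mean Gaussian prior.
    Setting gamma=0 recovers plain stochastic gradient Langevin dynamics.
    """
    rng = np.random.default_rng(seed)
    n_data, dim = data.shape

    # Skew-symmetric matrix (J.T == -J); this is the assumed mechanism for
    # breaking detailed balance. It requires dim >= 2.
    J = np.zeros((dim, dim))
    J[0, 1], J[1, 0] = 1.0, -1.0
    drift = np.eye(dim) + gamma * J

    theta = np.zeros(dim)
    samples = np.empty((n_steps, dim))
    for t in range(n_steps):
        # Mini-batch estimate of the gradient of the negative log posterior.
        batch = data[rng.choice(n_data, size=batch_size, replace=False)]
        grad_u = theta / prior_var + (n_data / batch_size) * (theta - batch).sum(axis=0)

        # Euler-Maruyama step: modified drift plus injected Gaussian noise.
        noise = rng.normal(size=dim)
        theta = theta - step * (drift @ grad_u) + np.sqrt(2.0 * step) * noise
        samples[t] = theta
    return samples


# Toy usage: 1000 synthetic points around an assumed true mean of (1, -2).
data = np.random.default_rng(1).normal(loc=[1.0, -2.0], scale=1.0, size=(1000, 2))
samples = sgld_skew(data)
print(samples[1000:].mean(axis=0))  # sample mean after discarding burn-in
```

Comparing the autocorrelation of `samples` for `gamma=0` and `gamma>0` is one way to probe the effect on correlation time. This isotropic toy target is not expected to show a large difference, so an anisotropic or multimodal target would be a more informative test.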
