Posterior Approximation using Stochastic Gradient Ascent with Adaptive Stepsize

Kart-Leong Lim; Xudong Jiang

arXiv:2412.08951·cs.LG·February 25, 2025

Open Access

TL;DR

This paper introduces an adaptive-stepsize stochastic gradient ascent method for posterior approximation of Dirichlet process mixtures, a Bayesian nonparametric model, and reports accuracy comparable to coordinate ascent on large-scale datasets.

Contribution

The paper develops an adaptive-stepsize stochastic gradient ascent algorithm for posterior approximation that uses Fisher information to improve speed and scalability.

Findings

1. Achieves comparable accuracy to coordinate ascent methods.

2. Scales effectively to large datasets like Caltech256 and SUN397.

3. Compatible with deep convolutional neural network features.

Abstract

Scalable algorithms for posterior approximation allow Bayesian nonparametric models such as the Dirichlet process mixture to scale up to larger datasets at a fraction of the cost. Recent algorithms, notably stochastic variational inference, perform local learning from minibatches. The main problem with stochastic variational inference is that it relies on closed-form solutions. Stochastic gradient ascent is a modern approach to machine learning and is widely deployed in the training of deep neural networks. In this work, we explore using stochastic gradient ascent as a fast algorithm for the posterior approximation of the Dirichlet process mixture. However, stochastic gradient ascent alone is not optimal for learning. To achieve both speed and performance, we turn our focus to stepsize optimization in stochastic gradient ascent. As an intermediate approach, we first optimize stepsize using the…
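
As a rough illustration of the idea described in the abstract, the Python toy below runs minibatch stochastic gradient ascent on the log-likelihood of a Gaussian mean and compares a fixed stepsize with a stepsize scaled by the inverse Fisher information. It is not the paper's algorithm for Dirichlet process mixtures: the model, the function name `sga_gaussian_mean`, the 1/t decay schedule, and all parameter values are assumptions chosen only to show how stepsize choice affects convergence.

```python
import numpy as np


def sga_gaussian_mean(data, sigma=1.0, batch_size=32, n_steps=200,
                      use_fisher=True, base_step=0.01, seed=0):
    """Estimate the mean of a Gaussian with known std `sigma` by minibatch
    stochastic gradient ascent on the average log-likelihood.

    Hypothetical toy example; not the method from the paper.
    """
    rng = np.random.default_rng(seed)
    mu = 0.0
    # Fisher information of mu for a single observation of N(mu, sigma^2).
    fisher = 1.0 / sigma ** 2
    for t in range(1, n_steps + 1):
        batch = rng.choice(data, size=batch_size, replace=False)
        # Gradient of the average log-likelihood of the batch w.r.t. mu.
        grad = (batch.mean() - mu) / sigma ** 2
        if use_fisher:
            # Assumed schedule: 1/t decay, rescaled by the inverse Fisher
            # information (a natural-gradient style step).
            step = (1.0 / t) / fisher
        else:
            # Baseline: constant stepsize, independent of the model scale.
            step = base_step
        mu += step * grad
    return mu


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    samples = rng.normal(loc=3.0, scale=2.0, size=5000)
    print("sample mean      :", samples.mean())
    print("Fisher-scaled SGA:", sga_gaussian_mean(samples, sigma=2.0))
    print("fixed-step SGA   :", sga_gaussian_mean(samples, sigma=2.0,
                                                  use_fisher=False))
```

With the 1/t schedule, the Fisher-scaled update in this toy reduces to a running average of the minibatch means, so it does not depend on the scale of `sigma`, whereas the fixed stepsize has to be retuned whenever `sigma` changes.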

Taxonomy

Topics: Machine Learning and ELM · Domain Adaptation and Few-Shot Learning

Methods: Average Pooling · Global Average Pooling · Kaiming Initialization · Adam · Variational Inference · Convolution · Max Pooling