Tuning-Free Coreset Markov Chain Monte Carlo via Hot DoG

Naitong Chen; Jonathan H. Huggins; Trevor Campbell

arXiv:2410.18973 · stat.CO · June 23, 2025

TL;DR

This paper introduces Hot DoG, a learning-rate-free stochastic gradient method for training the coreset weights in Coreset MCMC. It removes the need to tune a learning rate by hand, a choice to which the quality of the resulting posterior approximation is otherwise sensitive.

Contribution

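The paper proposes Hot DoG (Hot-start Distance over Gradient), a learning-rate-free optimization procedure for training coreset weights in Coreset MCMC, together with a theoretical analysis of the convergence of those weights and an empirical comparison against other learning-rate-free methods and tuned ADAM.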

Findings

01. Hot DoG outperforms other learning-rate-free methods in posterior quality.

02. Hot DoG performs comparably to optimally-tuned ADAM.

03. Theoretical convergence of Hot DoG is established.

Abstract

A Bayesian coreset is a small, weighted subset of a data set that replaces the full data during inference to reduce computational cost. The state-of-the-art coreset construction algorithm, Coreset Markov chain Monte Carlo (Coreset MCMC), uses draws from an adaptive Markov chain targeting the coreset posterior to train the coreset weights via stochastic gradient optimization. However, the quality of the constructed coreset, and thus the quality of its posterior approximation, is sensitive to the stochastic optimization learning rate. In this work, we propose a learning-rate-free stochastic gradient optimization procedure, Hot-start Distance over Gradient (Hot DoG), for training coreset weights in Coreset MCMC without user tuning effort. We provide a theoretical analysis of the convergence of the coreset weights produced by Hot DoG. We also provide empirical results demonstrating that Hot…
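
To make the "distance over gradient" idea concrete, here is a minimal sketch of the plain DoG step-size rule from the optimization literature, which sets each step size to the largest distance travelled from the starting point divided by the square root of the running sum of squared gradient norms. This is not the paper's Hot DoG procedure: the hot-start phase and anything specific to coreset weights are omitted, and the function names, the seed parameter `r_eps`, and the quadratic test problem are illustrative assumptions.

```python
import math


def dog_minimize(grad, x0, steps=2000, r_eps=1e-4):
    """Plain DoG gradient descent (illustrative; not the paper's Hot DoG).

    grad  : callable returning the gradient (list of floats) at a point
    x0    : starting point (list of floats)
    r_eps : assumed small seed for the initial "distance" so the first
            step is nonzero
    """
    x = list(x0)
    # Seed distance, scaled by the size of the starting point.
    max_dist = r_eps * (1.0 + math.sqrt(sum(v * v for v in x0)))
    grad_sq_sum = 0.0
    for _ in range(steps):
        g = grad(x)
        grad_sq_sum += sum(v * v for v in g)
        if grad_sq_sum == 0.0:
            break  # already at a stationary point
        # Step size = (max distance from x0 so far) / sqrt(sum of ||g||^2).
        eta = max_dist / math.sqrt(grad_sq_sum)
        x = [xi - eta * gi for xi, gi in zip(x, g)]
        dist = math.sqrt(sum((xi - x0i) ** 2 for xi, x0i in zip(x, x0)))
        max_dist = max(max_dist, dist)
    return x


def quadratic_grad(x):
    """Gradient of the toy objective 0.5 * ||x - (3, -2)||^2."""
    return [x[0] - 3.0, x[1] + 2.0]


if __name__ == "__main__":
    # No learning rate is supplied anywhere; the step size adapts itself.
    print(dog_minimize(quadratic_grad, [0.0, 0.0]))
```

The step size starts tiny and grows as the iterate moves away from its starting point, which is why no learning rate has to be chosen in advance. The toy problem uses exact gradients; in Coreset MCMC the gradients would instead be stochastic estimates built from Markov chain draws.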

Code & Models

Repositories

NaitongChen/automated-coreset-mcmc-experiments (Official)

Taxonomy

Topics: Markov Chains and Monte Carlo Methods

Methods: Sparse Evolutionary Training