R-ParVI: Particle-based variational inference through lens of rewards

Yongchao Huang

arXiv:2502.20482 · cs.AI · March 3, 2025

TL;DR

R-ParVI is a reward-guided, particle-based variational inference method for sampling from complex distributions: a reward mechanism steers particles towards high-density regions of the target while keeping them diverse, balancing exploration against convergence.

Contribution

The paper presents a gradient-free particle flow approach to sampling, in which rewards, rather than explicit gradients of the target density, steer particles through parameter space.

Findings

01 Enables fast and scalable sampling for complex probabilistic models

02 Maintains particle diversity while converging to the target distribution

03 Applicable to Bayesian inference and generative modeling

Abstract

A reward-guided, gradient-free ParVI method, R-ParVI, is proposed for sampling partially known densities (e.g. known only up to a normalising constant). R-ParVI formulates the sampling problem as a particle flow driven by rewards: particles are drawn from a prior distribution and navigate through parameter space, with movements determined by a reward mechanism that blends assessments from the target density, so that the steady-state particle configuration approximates the target geometry. Particle-environment interactions are simulated by stochastic perturbations and the reward mechanism, which drive particles towards high-density regions while maintaining diversity (e.g. preventing them from collapsing into clusters). R-ParVI offers fast, flexible, scalable and stochastic sampling and inference for a class of probabilistic models such as those encountered in Bayesian inference and generative modelling.
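The loop described in the abstract (draw particles from a prior, perturb them stochastically, move according to a reward) can be illustrated with a small sketch. The abstract does not specify the reward, so everything concrete here is an assumption made for illustration: the reward is taken to be the unnormalised log-density plus a weighted log nearest-neighbour distance as a diversity term, a move is kept only if it raises the reward, and the names log_target, nearest_neighbour_distance, r_parvi_sketch, step and diversity_weight are hypothetical. It is not the paper's algorithm and makes no claim to reproduce the target exactly.

```python
import numpy as np


def log_target(x):
    """Unnormalised log-density of a standard 2-D Gaussian (illustrative target)."""
    return -0.5 * np.sum(x**2, axis=-1)


def nearest_neighbour_distance(points, population):
    """Distance from points[i] to the closest population[j] with j != i."""
    diff = points[:, None, :] - population[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)  # ignore each particle's own slot
    return dist.min(axis=1)


def r_parvi_sketch(log_p, n_particles=200, dim=2, n_steps=300,
                   step=0.3, diversity_weight=0.5, seed=0):
    """Gradient-free, reward-guided particle update (assumed reward, see lead-in)."""
    rng = np.random.default_rng(seed)
    # Particles start from a broad Gaussian prior.
    particles = rng.normal(scale=3.0, size=(n_particles, dim))

    def reward(points):
        # Assumed reward: density term plus a bonus for staying away from
        # the other particles (log keeps the bonus from dominating).
        spacing = nearest_neighbour_distance(points, particles)
        return log_p(points) + diversity_weight * np.log(spacing + 1e-12)

    for _ in range(n_steps):
        # Stochastic perturbation: only density evaluations, no gradients.
        proposals = particles + step * rng.normal(size=particles.shape)
        accept = reward(proposals) > reward(particles)
        particles = np.where(accept[:, None], proposals, particles)
    return particles


samples = r_parvi_sketch(log_target)
print("mean squared norm:", np.mean(np.sum(samples**2, axis=1)))
```

The density term pulls particles towards high-density regions, while the spacing term penalises particles that crowd together, which is one simple way to mimic the "maintaining diversity" behaviour the abstract describes. Only evaluations of the unnormalised density are needed, so the normalising constant never appears.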
