R-ParVI: Particle-based variational inference through lens of rewards
Yongchao Huang

TL;DR
R-ParVI introduces a reward-guided, particle-based variational inference method that efficiently samples from complex distributions by guiding particles through a reward mechanism, balancing exploration and convergence.
Contribution
It presents a novel gradient-free particle flow approach for sampling, leveraging rewards to navigate parameter space without requiring explicit gradients.
Findings
Enables fast and scalable sampling for complex probabilistic models
Maintains particle diversity while converging to target distribution
Applicable to Bayesian inference and generative modeling
Abstract
A reward-guided, gradient-free ParVI method, R-ParVI, is proposed for sampling partially known densities (e.g. known only up to a normalising constant). R-ParVI formulates the sampling problem as a particle flow driven by rewards: particles are drawn from a prior distribution and navigate through parameter space, with movements determined by a reward mechanism blending assessments from the target density; the steady-state particle configuration approximates the target geometry. Particle-environment interactions are simulated by stochastic perturbations and the reward mechanism, which drive particles towards high-density regions while maintaining diversity (e.g. preventing them from collapsing into clusters). R-ParVI offers fast, flexible, scalable and stochastic sampling and inference for a class of probabilistic models such as those encountered in Bayesian inference and generative modelling.
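The abstract does not spell out the reward function or the update rule, so the following is only a minimal sketch of the general idea (prior draws, stochastic perturbations, a reward that blends target density with a diversity term), not the paper's algorithm. Everything specific here is an assumption made for illustration: the 1-D Gaussian-mixture target in `log_target`, the nearest-neighbour diversity bonus in `reward`, the greedy accept-if-reward-improves rule, and all parameter names and values.

```python
import numpy as np


def log_target(x):
    """Unnormalised log-density of a hypothetical 1-D mixture with modes at -2 and +2."""
    return np.logaddexp(-0.5 * (x + 2.0) ** 2, -0.5 * (x - 2.0) ** 2)


def reward(x, others, diversity_weight=0.1):
    """Assumed reward: log-density plus a bonus for staying away from the nearest particle."""
    nearest = np.min(np.abs(others - x))
    return log_target(x) + diversity_weight * np.log(nearest + 1e-12)


def r_parvi_sketch(n_particles=100, n_steps=200, step_size=0.5, seed=0):
    """Gradient-free particle updates: perturb each particle, keep the move if reward improves."""
    rng = np.random.default_rng(seed)
    # Particles start as draws from a broad Gaussian prior (an assumption).
    particles = rng.normal(0.0, 5.0, size=n_particles)
    for _ in range(n_steps):
        for i in range(n_particles):
            others = np.delete(particles, i)
            # Stochastic perturbation; no gradient of the target is evaluated.
            proposal = particles[i] + step_size * rng.normal()
            if reward(proposal, others) > reward(particles[i], others):
                particles[i] = proposal
    return particles


if __name__ == "__main__":
    samples = r_parvi_sketch()
    print("fraction of particles near the left mode:", np.mean(samples < 0.0))
```

Only the unnormalised log-density is evaluated, so the normalising constant is never needed. The greedy acceptance rule is a simplification: it moves particles towards high-density regions and keeps them from coinciding, but it is not claimed to reproduce the target distribution exactly.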
