Isoperimetry is All We Need: Langevin Posterior Sampling for RL with Sublinear Regret
Emilio Jorge, Christos Dimitrakakis, Debabrota Basu

TL;DR
This paper introduces a Langevin sampling-based RL algorithm, LaPSRL, that achieves sublinear regret under isoperimetric distribution assumptions, extending RL theory beyond traditional Gaussian or log-concave models.
Contribution
The paper develops LaPSRL, a novel Langevin sampling-based RL algorithm with order-optimal regret for isoperimetric distributions, broadening applicability beyond classical assumptions.
Findings
LaPSRL achieves sublinear regret when the data distributions satisfy the Log-Sobolev Inequality (LSI).
Experimental results show LaPSRL's competitive performance across various environments.
LaPSRL has subquadratic complexity per episode, making it practical for large-scale problems.
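For reference, one common way to state the Log-Sobolev Inequality is given below. The symbols $\nu$, $\alpha$, and $g$ are notation chosen here for illustration; the paper may use a different normalization of the constant.

```latex
% A distribution \nu on \mathbb{R}^d satisfies LSI with constant \alpha > 0
% if, for every smooth function g with \mathbb{E}_\nu[g^2] < \infty,
\mathbb{E}_\nu\!\left[ g^2 \log g^2 \right]
  - \mathbb{E}_\nu\!\left[ g^2 \right] \log \mathbb{E}_\nu\!\left[ g^2 \right]
  \;\le\; \frac{2}{\alpha}\, \mathbb{E}_\nu\!\left[ \lVert \nabla g \rVert^2 \right].
```

Strongly log-concave distributions satisfy this inequality, which is consistent with the abstract's remark that LSI distributions include the standard setups of RL theory.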
Abstract
Common assumptions, like linear or RKHS models, and Gaussian or log-concave posteriors over the models, do not explain the practical success of RL across a wider range of distributions and models. Thus, we study how to design RL algorithms with sublinear regret for isoperimetric distributions, specifically the ones satisfying the Log-Sobolev Inequality (LSI). LSI distributions include the standard setups of RL theory, and others, such as many non-log-concave and perturbed distributions. First, we show that the Posterior Sampling-based RL (PSRL) algorithm yields sublinear regret if the data distributions satisfy LSI and some mild additional assumptions. Also, when we cannot compute or sample from an exact posterior, we propose a Langevin sampling-based algorithm design: LaPSRL. We show that LaPSRL achieves order-optimal regret and subquadratic complexity per episode. Finally, we deploy…
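To make the idea of Langevin-based posterior sampling concrete, here is a minimal sketch of the unadjusted Langevin algorithm (ULA), one standard Langevin sampler, applied to a toy Gaussian posterior. It is not the paper's implementation: the function `ula_sample`, the toy model, the step size, and the number of steps are all assumptions made for illustration, and the sampler and tuning that LaPSRL actually uses are not specified in this summary.

```python
import numpy as np


def ula_sample(grad_neg_log_post, theta0, step_size, n_steps, rng):
    """Run unadjusted Langevin dynamics (hypothetical helper, not from the paper).

    Each step is: theta <- theta - step_size * grad U(theta) + sqrt(2 * step_size) * noise,
    where U is the negative log posterior density (up to a constant).
    """
    theta = np.asarray(theta0, dtype=float).copy()
    for _ in range(n_steps):
        noise = rng.standard_normal(theta.shape)
        theta = theta - step_size * grad_neg_log_post(theta) + np.sqrt(2.0 * step_size) * noise
    return theta


# Toy model (an assumption): unknown mean theta with prior N(0, 1),
# and observations y_i ~ N(theta, 1).
rng = np.random.default_rng(0)
data = rng.normal(loc=1.5, scale=1.0, size=50)
n = data.size


def grad_neg_log_post(theta):
    # U(theta) = theta^2 / 2 + sum_i (y_i - theta)^2 / 2, applied elementwise
    # so that many independent chains can be advanced at once.
    return theta + (n * theta - data.sum())


# Closed-form posterior for this conjugate model, used only as a sanity check.
post_mean = data.sum() / (n + 1)
post_var = 1.0 / (n + 1)

# Advance 2000 independent chains in parallel; each entry is one approximate posterior draw.
samples = ula_sample(grad_neg_log_post, np.zeros(2000), step_size=1e-3, n_steps=500, rng=rng)

print("empirical mean:", samples.mean(), "exact mean:", post_mean)
print("empirical var: ", samples.var(), "exact var: ", post_var)
```

In a posterior-sampling RL loop, a draw like this would stand in for the exact posterior sample taken at the start of each episode, after which the agent plans against the sampled model. Because ULA omits a Metropolis correction, its draws are slightly biased for any fixed step size, so the empirical variance above is expected to be close to, but not exactly equal to, the closed-form value.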
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Gene expression and cancer classification · Advanced MRI Techniques and Applications
