Variational Inference for Model-Free and Model-Based Reinforcement Learning

Felix Leibfried

arXiv:2209.01693 · cs.LG · December 20, 2022

TL;DR

This paper explores how variational inference (VI) can unify and enhance reinforcement learning (RL) methods, especially in model-based settings, by framing policy optimization and environment modeling as inference problems.

Contribution

It demonstrates the connection between VI and RL objectives, introducing a regularized VI framework that improves agent performance and clarifies inference in environment modeling.

Findings

01. VI recovers RL optimization objectives under soft policy constraints
02. Regularized VI improves agent performance in RL tasks
03. VI provides a natural framework for environment model learning in RL

Abstract

Variational inference (VI) is a specific type of approximate Bayesian inference that approximates an intractable posterior distribution with a tractable one. VI casts the inference problem as an optimization problem: the goal is to maximize a lower bound on the logarithm of the marginal likelihood with respect to the parameters of the approximate posterior. Reinforcement learning (RL), on the other hand, deals with autonomous agents and how to make them act optimally so as to maximize some notion of expected future cumulative reward. In the non-sequential setting, where agents' actions do not have an impact on future states of the environment, RL is covered by contextual bandits and Bayesian optimization. In a proper sequential scenario, however, where agents' actions affect future states, instantaneous rewards need to be carefully traded off against potential…
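
The lower bound mentioned in the abstract can be illustrated on a toy problem. The sketch below is not taken from the paper: it assumes a hypothetical model with a single binary latent variable z, and the names `prior`, `likelihood`, `elbo`, `log_evidence`, and `exact_posterior` are invented for illustration. It shows that the bound never exceeds the log marginal likelihood and becomes tight when the approximate posterior equals the exact one.

```python
import math

# Hypothetical toy model (an assumption, not from the paper):
# binary latent z with prior p(z) and likelihood p(x_obs | z) for one fixed observation.
prior = [0.6, 0.4]
likelihood = [0.2, 0.9]


def log_evidence(prior, likelihood):
    """Log marginal likelihood: log sum_z p(z) p(x_obs | z)."""
    return math.log(sum(p * l for p, l in zip(prior, likelihood)))


def elbo(q, prior, likelihood):
    """Lower bound: E_q[log p(x_obs, z) - log q(z)] for a distribution q over z."""
    return sum(
        qz * (math.log(pz * lz) - math.log(qz))
        for qz, pz, lz in zip(q, prior, likelihood)
        if qz > 0  # terms with q(z) = 0 contribute nothing
    )


def exact_posterior(prior, likelihood):
    """Exact posterior p(z | x_obs), tractable here because z is binary."""
    joint = [p * l for p, l in zip(prior, likelihood)]
    total = sum(joint)
    return [j / total for j in joint]


if __name__ == "__main__":
    print("log evidence:        ", log_evidence(prior, likelihood))
    print("ELBO, uniform q:     ", elbo([0.5, 0.5], prior, likelihood))
    print("ELBO, exact posterior:", elbo(exact_posterior(prior, likelihood), prior, likelihood))
```

In realistic settings the sum over z is intractable, which is why the bound is maximized over a parameterized family of approximate posteriors instead of computing the posterior directly.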

Taxonomy

Topics: Machine Learning and Algorithms · Advanced Bandit Algorithms Research · Gaussian Processes and Bayesian Inference