On solutions of the distributional Bellman equation
Julian Gerstenberg, Ralph Neininger, Denis Spiegel

TL;DR
This paper studies distributional Bellman equations from reinforcement learning: when solutions exist and are unique, how the tails of return distributions behave, and how these equations relate to multivariate affine distributional equations.
Contribution
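It gives necessary and sufficient conditions for the existence and uniqueness of solutions of distributional Bellman equations, and shows that any solution arises as the vector of marginal laws of a solution to a multivariate affine distributional equation, which makes the general theory of such equations applicable.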
Findings
Established conditions for existence and uniqueness of solutions
Identified cases of regular variation in return distributions
Connected distributional Bellman equations to affine distributional equations
Abstract
In distributional reinforcement learning not only expected returns but the complete return distributions of a policy are taken into account. The return distribution for a fixed policy is given as the solution of an associated distributional Bellman equation. In this note we consider general distributional Bellman equations and study existence and uniqueness of their solutions as well as tail properties of return distributions. We give necessary and sufficient conditions for existence and uniqueness of return distributions and identify cases of regular variation. We link distributional Bellman equations to multivariate affine distributional equations. We show that any solution of a distributional Bellman equation can be obtained as the vector of marginal laws of a solution to a multivariate affine distributional equation. This makes the general theory of such equations applicable to the…
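As a rough illustration of what "solution of a distributional Bellman equation" means, the sketch below checks the fixed-point property numerically in a simple special case: a constant discount factor and the familiar form G(s) =_d R(s) + γ·G(S'), where S' is the next state. Everything concrete here is an assumption made for illustration and is not taken from the paper: the two-state Markov reward process, the names `P`, `REWARD`, `GAMMA`, `sample_return` and `bellman_resample`, and the truncation horizon. The paper treats more general equations than this one.

```python
import random

# Hypothetical two-state Markov reward process (illustrative, not from the paper).
P = [[0.9, 0.1], [0.5, 0.5]]   # P[s][t]: probability of moving from state s to t
REWARD = [1.0, -1.0]           # deterministic reward collected in state s
GAMMA = 0.9                    # constant discount factor (a special case)


def next_state(state, rng):
    """Sample the successor of `state` according to row P[state]."""
    return 0 if rng.random() < P[state][0] else 1


def sample_return(state, rng, horizon=150):
    """Draw one sample of the discounted return, truncated after `horizon` steps."""
    total, discount = 0.0, 1.0
    for _ in range(horizon):
        total += discount * REWARD[state]
        discount *= GAMMA
        state = next_state(state, rng)
    return total


def bellman_resample(state, samples, rng):
    """Draw from the right-hand side R(s) + GAMMA * G(S') using stored samples of G."""
    nxt = next_state(state, rng)
    return REWARD[state] + GAMMA * rng.choice(samples[nxt])


def mean(xs):
    return sum(xs) / len(xs)


rng = random.Random(0)
n = 5000

# Left-hand side: direct Monte Carlo samples of the return from each state.
samples = [[sample_return(s, rng) for _ in range(n)] for s in (0, 1)]

# Right-hand side: one application of the Bellman operator to those samples.
rhs = [[bellman_resample(s, samples, rng) for _ in range(n)] for s in (0, 1)]

for s in (0, 1):
    print(f"state {s}: mean of G = {mean(samples[s]):.3f}, "
          f"mean of R + GAMMA*G(S') = {mean(rhs[s]):.3f}")
```

If the sampled returns approximate a solution, both sides should agree in distribution up to Monte Carlo and truncation error. The printout compares only the means, which is a necessary check rather than a sufficient one; comparing empirical quantiles of `samples[s]` and `rhs[s]` would be a stricter test.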
Taxonomy
Topics: Reinforcement Learning in Robotics · Sports Analytics and Performance
