Conjugated Discrete Distributions for Distributional Reinforcement Learning
Björn Lindenberg, Jonas Nordqvist, Karl-Olof Lindahl

TL;DR
This paper introduces conjugated distributional operators in reinforcement learning to handle unaltered rewards effectively, demonstrating state-of-the-art results on Atari games with theoretical guarantees of convergence.
Contribution
It proposes a novel conjugated distributional operator and an associated algorithm that directly trains on unaltered rewards, improving upon existing reward transformation methods.
Findings
Achieves state-of-the-art performance on 55 Atari games.
Provides theoretical convergence guarantees for the proposed method.
Handles large reward magnitudes without clipping or otherwise altering the rewards.
Abstract
In this work we continue to build upon recent advances in reinforcement learning for finite Markov processes. A common approach among previous existing algorithms, both single-actor and distributed, is to either clip rewards or to apply a transformation method on Q-functions to handle a large variety of magnitudes in real discounted returns. We theoretically show that one of the most successful methods may not yield an optimal policy if we have a non-deterministic process. As a solution, we argue that distributional reinforcement learning lends itself to remedy this situation completely. By the introduction of a conjugated distributional operator we may handle a large class of transformations for real returns with guaranteed theoretical convergence. We propose an approximating single-actor algorithm based on this operator that trains agents directly on unaltered rewards using a proper…
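The abstract's claim that transforming Q-functions can fail in a non-deterministic process can be illustrated with a one-step toy problem. The sketch below is not the paper's operator or algorithm. The squashing function `h`, the two actions, and their reward values are all assumptions chosen for illustration. It shows that a scalar learner converging to the expectation of transformed rewards can rank actions differently from their true means, whereas keeping the whole distribution over transformed values and inverting each atom before averaging recovers the true ranking.

```python
import math

def h(x):
    # Assumed squashing transform (illustrative only, not taken from the paper).
    return math.copysign(math.sqrt(abs(x) + 1.0) - 1.0, x)

def h_inv(y):
    # Exact inverse of the assumed transform h.
    return math.copysign((abs(y) + 1.0) ** 2 - 1.0, y)

# Hypothetical one-step problem: action -> list of (probability, reward).
ACTIONS = {
    "safe": [(1.0, 10.0)],                 # mean reward 10
    "risky": [(0.5, 0.0), (0.5, 24.0)],    # mean reward 12
}

def true_mean(outcomes):
    # Expected unaltered reward.
    return sum(p * r for p, r in outcomes)

def scalar_transformed_value(outcomes):
    # A scalar estimate regressed onto h(reward) converges to E[h(R)],
    # which in general differs from h(E[R]) when rewards are stochastic.
    return sum(p * h(r) for p, r in outcomes)

def distributional_value(outcomes):
    # Keep the full distribution over transformed atoms, then map each
    # atom back through h_inv before taking the expectation.
    atoms = [(p, h(r)) for p, r in outcomes]
    return sum(p * h_inv(z) for p, z in atoms)

def greedy(value_fn):
    # Action with the largest value under the given evaluation function.
    return max(ACTIONS, key=lambda a: value_fn(ACTIONS[a]))

if __name__ == "__main__":
    for name, fn in [("true mean", true_mean),
                     ("scalar transformed", scalar_transformed_value),
                     ("distributional", distributional_value)]:
        print(name, "->", greedy(fn))
```

In this toy setting the scalar transformed value prefers the safe action although the risky action has the higher mean, because the assumed `h` is concave for positive rewards and does not commute with the expectation. The distributional variant avoids this since the inversion is applied per atom rather than to an averaged value.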
Taxonomy
Topics: Reinforcement Learning in Robotics · Game Theory and Applications · Auction Theory and Applications
