Conjugated Discrete Distributions for Distributional Reinforcement Learning

Björn Lindenberg; Jonas Nordqvist; Karl-Olof Lindahl

arXiv:2112.07424·cs.LG·December 15, 2021

TL;DR

This paper introduces a conjugated distributional operator for reinforcement learning that lets agents train directly on unaltered rewards. The method comes with theoretical convergence guarantees and reports state-of-the-art results on Atari games.

Contribution

It proposes a novel conjugated distributional operator and an associated algorithm that directly trains on unaltered rewards, improving upon existing reward transformation methods.

Findings

01 Achieved state-of-the-art performance on 55 Atari games.

02 Provided theoretical convergence guarantees for the proposed method.

03 Effectively handles large reward magnitudes without clipping or transformations.

Abstract

In this work we continue to build upon recent advances in reinforcement learning for finite Markov processes. A common approach among existing algorithms, both single-actor and distributed, is to either clip rewards or to apply a transformation method on Q-functions to handle a large variety of magnitudes in real discounted returns. We theoretically show that one of the most successful methods may not yield an optimal policy if we have a non-deterministic process. As a solution, we argue that distributional reinforcement learning lends itself to remedy this situation completely. By the introduction of a conjugated distributional operator we may handle a large class of transformations for real returns with guaranteed theoretical convergence. We propose an approximating single-actor algorithm based on this operator that trains agents directly on unaltered rewards using a proper…
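The idea of conjugating a return transformation with the distributional backup can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper or its repository: the squashing function `h` (a commonly used square-root transform), the constant `EPS`, and the helper names `conjugated_update` and `expected_return`. The sketch stores atoms in the transformed space, maps each atom back with `h_inv`, applies the ordinary reward-and-discount step, and maps it forward again with `h`. It omits the projection onto a fixed discrete support that an approximating algorithm would need.

```python
import math

EPS = 1e-2  # assumed regularisation constant, chosen for illustration only


def h(x: float) -> float:
    """Assumed squashing transform: sign(x) * (sqrt(|x| + 1) - 1) + EPS * x."""
    return math.copysign(math.sqrt(abs(x) + 1.0) - 1.0, x) + EPS * x


def h_inv(y: float) -> float:
    """Closed-form inverse of h, obtained by solving a quadratic in sqrt(|x| + 1)."""
    s = (math.sqrt(1.0 + 4.0 * EPS * (abs(y) + 1.0 + EPS)) - 1.0) / (2.0 * EPS)
    return math.copysign(s * s - 1.0, y)


def conjugated_update(atoms, probs, reward, gamma):
    """Hypothetical helper: push each transformed atom z to h(r + gamma * h_inv(z)).

    Probabilities are unchanged; only the atom locations move.
    """
    new_atoms = [h(reward + gamma * h_inv(z)) for z in atoms]
    return new_atoms, list(probs)


def expected_return(atoms, probs):
    """Mean of the real (untransformed) return encoded by the transformed atoms."""
    return sum(p * h_inv(z) for z, p in zip(atoms, probs))


# A stochastic next-state return: 0 or 100 with equal probability (real mean 50).
atoms = [h(0.0), h(100.0)]
probs = [0.5, 0.5]
reward, gamma = 1.0, 0.9

# Distributional route: transform atom by atom, then read off the real mean.
new_atoms, new_probs = conjugated_update(atoms, probs, reward, gamma)
distributional = expected_return(new_atoms, new_probs)

# Scalar route: average in the transformed space first, then invert.
q_transformed = sum(p * z for z, p in zip(atoms, probs))
scalar = reward + gamma * h_inv(q_transformed)

exact = reward + gamma * 50.0
print(f"exact={exact:.3f} distributional={distributional:.3f} scalar={scalar:.3f}")
```

Because `h` is nonlinear, averaging transformed values and then inverting does not recover the real expected return when the process is stochastic, whereas moving every atom through `h_inv`, the backup, and `h` keeps the real-space mean intact. This is only a toy instance of the non-determinism issue the abstract describes, not a reproduction of the paper's proof or algorithm.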

Code & Models

Repositories

bjliaa/c2d · tf · Official

Videos

Conjugated Discrete Distributions for Distributional Reinforcement Learning · underline

Taxonomy

Topics

Reinforcement Learning in Robotics · Game Theory and Applications · Auction Theory and Applications