Concurrent Credit Assignment for Data-efficient Reinforcement Learning

Emmanuel Daucé

arXiv:2205.12020·cs.LG·May 25, 2022·1 cites

TL;DR

This paper introduces a novel reinforcement learning approach that uses a variational occupancy model to improve exploration efficiency, leading to faster training and higher returns in continuous action tasks.

Contribution

It proposes a concurrent credit assignment method leveraging a variational occupancy model to enhance data efficiency in reinforcement learning.

Findings

01 Significant reduction in training time.
02 Higher returns in continuous control benchmarks.
03 Effective in both dense and sparse reward settings.

Abstract

The ability to sample the state and action spaces widely is a key ingredient in building effective reinforcement learning algorithms. The variational optimization principles presented in this paper emphasize the importance of an occupancy model that synthesizes the general distribution of the agent's environmental states over which it can act (defining a virtual "territory"). The occupancy model is updated frequently as exploration progresses and new states are uncovered during training. Under a uniform prior assumption, the resulting objective expresses a balance between two concurrent tendencies, namely the widening of the occupancy space and the maximization of the rewards, reminiscent of the classical exploration/exploitation trade-off. Implemented on an off-policy actor-critic algorithm on classic continuous-action benchmarks, it is shown to…
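The balance described in the abstract can be illustrated with a small sketch. On a bounded state space, the KL divergence from an occupancy distribution to a uniform prior equals minus the occupancy entropy plus a constant, so widening the occupancy amounts to raising that entropy. The code below is not the paper's variational occupancy model: it substitutes a plain histogram estimate, and the function names, the `beta` weighting, and the `[-1, 1]` state bounds are all assumptions made for illustration.

```python
import numpy as np


def occupancy_entropy(states, bins=10, low=-1.0, high=1.0):
    """Shannon entropy (in nats) of a histogram estimate of state occupancy.

    Hypothetical stand-in for a learned occupancy model. States are assumed
    to lie in [low, high] along every dimension; samples outside are ignored.
    """
    states = np.asarray(states, dtype=float)
    if states.ndim == 1:
        states = states[:, None]  # treat a flat array as 1-D states
    ranges = [(low, high)] * states.shape[1]
    counts, _ = np.histogramdd(states, bins=bins, range=ranges)
    p = counts.ravel() / counts.sum()
    p = p[p > 0]  # empty cells contribute nothing to the entropy
    return float(-(p * np.log(p)).sum())


def balanced_objective(states, rewards, beta=1.0, **hist_kwargs):
    """Mean reward plus (1 / beta) times the occupancy entropy.

    beta is an assumed trade-off weight: a large beta favours reward
    maximization, a small beta favours widening the occupancy.
    """
    return float(np.mean(rewards)) + occupancy_entropy(states, **hist_kwargs) / beta


# Two visit patterns with identical rewards: one spread out, one concentrated.
wide_states = np.linspace(-0.95, 0.95, 10)
narrow_states = np.zeros(10)
rewards = np.ones(10)

wide_score = balanced_objective(wide_states, rewards, beta=2.0)
narrow_score = balanced_objective(narrow_states, rewards, beta=2.0)
```

With equal rewards, the spread-out visit pattern scores higher because only the entropy term differs. A histogram scales poorly with state dimension, which is one reason a learned occupancy model would be preferred in practice.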

Code & Models

Repositories

edauce/ijcnn-cca (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Auction Theory and Applications · Adaptive Dynamic Programming Control