Constrained Intrinsic Motivation for Reinforcement Learning

Xiang Zheng, Xingjun Ma, Chao Shen, Cong Wang

arXiv:2407.09247 · cs.AI · July 15, 2024

TL;DR

This paper introduces Constrained Intrinsic Motivation (CIM), an approach to intrinsic motivation in reinforcement learning that targets two settings: reward-free pre-training, where it aims to improve skill diversity and state coverage, and exploration with intrinsic motivation, where it aims to reduce the bias that the intrinsic objective introduces.

Contribution

The paper proposes two CIM methods: one addresses static skills and limited state coverage in unsupervised skill discovery, and the other addresses the bias of the intrinsic objective by adapting how intrinsic rewards are used alongside task rewards.

Findings

01. CIM significantly outperforms existing IM methods in skill diversity and state coverage.

02. CIM enhances fine-tuning performance in unsupervised skill discovery.

03. CIM effectively redeems intrinsic rewards when task rewards are available from the start.

Abstract

This paper investigates two fundamental problems that arise when utilizing Intrinsic Motivation (IM) for reinforcement learning in Reward-Free Pre-Training (RFPT) tasks and Exploration with Intrinsic Motivation (EIM) tasks: 1) how to design an effective intrinsic objective in RFPT tasks, and 2) how to reduce the bias introduced by the intrinsic objective in EIM tasks. Existing IM methods suffer from static skills, limited state coverage, sample inefficiency in RFPT tasks, and suboptimality in EIM tasks. To tackle these problems, we propose Constrained Intrinsic Motivation (CIM) for RFPT and EIM tasks, respectively: 1) CIM for RFPT maximizes the lower bound of the conditional state entropy subject to an alignment constraint on the state encoder network for efficient dynamic and diverse skill discovery and state coverage maximization; 2) CIM for EIM leverages constrained policy…
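For readers new to intrinsic motivation, the sketch below shows one generic kind of intrinsic reward used to encourage state coverage: a k-nearest-neighbor novelty bonus, in which a state earns more reward the farther it lies from its neighbors in a batch. This is an illustration of the general idea only and is not the CIM objective described in the abstract; the function name `knn_novelty_rewards`, the choice of `k`, and the toy batch are all assumptions made for the example.

```python
import math


def knn_novelty_rewards(states, k=3):
    """Illustrative k-NN novelty bonus (hypothetical helper, not from the paper).

    Each state receives log(1 + distance to its k-th nearest neighbor),
    so states in sparsely visited regions earn a larger intrinsic reward.
    """
    if not 1 <= k < len(states):
        raise ValueError("k must satisfy 1 <= k < number of states")
    rewards = []
    for i, s in enumerate(states):
        # Euclidean distances from state i to every other state in the batch.
        dists = sorted(
            math.dist(s, other) for j, other in enumerate(states) if j != i
        )
        rewards.append(math.log(1.0 + dists[k - 1]))
    return rewards


# Toy batch: four states clustered near the origin and one far-away state.
batch = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
rewards = knn_novelty_rewards(batch, k=2)
```

In this toy batch the isolated state at (5.0, 5.0) receives the largest bonus, which is the behavior a coverage-seeking intrinsic reward is meant to produce.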

Code & Models

Repositories

x-zheng16/cim (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics