Surprise-Adaptive Intrinsic Motivation for Unsupervised Reinforcement Learning

Adriana Hugessen; Roger Creus Castanyer; Faisal Mohamed; Glen Berseth

arXiv:2405.17243 · cs.LG · August 19, 2024

TL;DR

This paper introduces a reinforcement learning agent that adaptively switches between entropy-maximizing and entropy-minimizing objectives depending on the entropy conditions of its environment, yielding emergent behaviors in both high- and low-entropy settings.

Contribution

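The authors propose an adaptive intrinsic motivation mechanism that frames the choice between entropy-minimizing and entropy-maximizing objectives as a multi-armed bandit problem. The bandit is driven by a novel intrinsic feedback signal that captures the agent's ability to control the entropy in its environment.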

Findings

01  Agents can learn to control entropy in their environment.

02  Emergent behaviors are observed in both high- and low-entropy regimes.

03  Agents learn skillful behaviors in benchmark tasks.

Abstract

Both entropy-minimizing and entropy-maximizing (curiosity) objectives for unsupervised reinforcement learning (RL) have been shown to be effective in different environments, depending on the environment's level of natural entropy. However, neither method alone results in an agent that will consistently learn intelligent behavior across environments. In an effort to find a single entropy-based method that will encourage emergent behaviors in any environment, we propose an agent that can adapt its objective online, depending on the entropy conditions, by framing the choice as a multi-armed bandit problem. We devise a novel intrinsic feedback signal for the bandit, which captures the agent's ability to control the entropy in its environment. We demonstrate that such agents can learn to control entropy and exhibit emergent behaviors in both high- and low-entropy regimes and can learn…
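
To make the bandit framing in the abstract concrete, here is a minimal sketch of a two-armed bandit that picks which entropy objective to optimize in each episode. It is an illustration under stated assumptions, not the authors' implementation. The UCB1 selection rule, the exploration constant `c`, the names `ObjectiveBandit` and `run`, and the scalar `feedback` are all hypothetical. In particular, the paper's actual feedback signal (the agent's ability to control entropy) is not reproduced here; the caller supplies a stand-in via `feedback_fn`.

```python
import math
import random

# Two arms, one per intrinsic objective named in the abstract.
ARMS = ("minimize_entropy", "maximize_entropy")


class ObjectiveBandit:
    """UCB1 bandit over the two objectives (illustrative assumption, not the paper's code)."""

    def __init__(self, n_arms=len(ARMS), c=2.0):
        self.counts = [0] * n_arms    # number of times each arm was selected
        self.values = [0.0] * n_arms  # running mean feedback per arm
        self.c = c                    # exploration constant (hypothetical choice)

    def select(self):
        # Try every arm once before applying the UCB rule.
        for arm, n in enumerate(self.counts):
            if n == 0:
                return arm
        total = sum(self.counts)
        scores = [
            v + math.sqrt(self.c * math.log(total) / n)
            for v, n in zip(self.values, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, feedback):
        # Incremental update of the arm's mean feedback.
        self.counts[arm] += 1
        self.values[arm] += (feedback - self.values[arm]) / self.counts[arm]


def run(feedback_fn, episodes=200, seed=0):
    """Select an objective each episode and update the bandit with its feedback.

    feedback_fn(arm, rng) stands in for an episode of RL training followed by
    computing the intrinsic feedback signal; it is a placeholder.
    """
    rng = random.Random(seed)
    bandit = ObjectiveBandit()
    for _ in range(episodes):
        arm = bandit.select()
        bandit.update(arm, feedback_fn(arm, rng))
    return bandit
```

As a usage example, calling `run` with a synthetic `feedback_fn` that returns larger values for one arm leads the bandit to select that arm's objective in most episodes, while still sampling the other arm occasionally.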

Code & Models

Repositories

roger-creus/surprise-adaptive-agents
PyTorch · Official

Taxonomy

Topics: Smart Parking Systems Research · Transportation and Mobility Innovations · Reinforcement Learning in Robotics