Diversifying Policy Behaviors with Extrinsic Behavioral Curiosity

Zhenglin Wan; Xingrui Yu; David Mark Bossens; Yueming Lyu; Qing Guo; Flint Xiaofeng Fan; Yew Soon Ong; Ivor Tsang

arXiv:2410.06151 · cs.LG · August 15, 2025

TL;DR

This paper introduces Extrinsic Behavioral Curiosity (EBC), a method that rewards agents for exploring novel behaviors, improving the diversity and robustness of the behaviors learned through imitation on robot locomotion tasks.

Contribution

The paper proposes EBC, a curiosity-driven approach that increases behavioral diversity in both imitation learning and QD-RL. It outperforms existing methods and, on Humanoid tasks, exceeds expert performance by 20%.

Findings

01. EBC improves QD-IRL performance by up to 185%.

02. EBC surpasses expert performance by 20% in Humanoid tasks.

03. EBC enhances gradient-based QD-RL algorithms.

Abstract

Imitation learning (IL) has shown promise in various applications (e.g. robot locomotion) but is often limited to learning a single expert policy, constraining behavior diversity and robustness in unpredictable real-world scenarios. To address this, we introduce Quality Diversity Inverse Reinforcement Learning (QD-IRL), a novel framework that integrates quality-diversity optimization with IRL methods, enabling agents to learn diverse behaviors from limited demonstrations. This work introduces Extrinsic Behavioral Curiosity (EBC), which allows agents to receive additional curiosity rewards from an external critic based on how novel the behaviors are with respect to a large behavioral archive. To validate the effectiveness of EBC in exploring diverse locomotion behaviors, we evaluate our method on multiple robot locomotion tasks. EBC improves the performance of QD-IRL instances with GAIL,…
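The abstract describes the curiosity reward only in words, so a small sketch may help make the idea concrete. The code below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a grid-shaped behavioral archive over measures normalized to [0, 1], and a constant bonus added to every step of an episode whose measure lands in an unvisited cell. The names `BehaviorArchive`, `shaped_rewards`, and `bonus_scale` are hypothetical and do not come from the paper.

```python
import numpy as np


class BehaviorArchive:
    """Hypothetical grid archive over a behavior-measure space in [0, 1]^d."""

    def __init__(self, dims, cells_per_dim):
        self.cells_per_dim = cells_per_dim
        # One boolean per cell: has any policy with this behavior been stored?
        self.occupied = np.zeros((cells_per_dim,) * dims, dtype=bool)

    def cell_of(self, measure):
        # Map a measure vector to the index of the grid cell containing it.
        m = np.clip(np.asarray(measure, dtype=float), 0.0, 1.0)
        idx = np.minimum((m * self.cells_per_dim).astype(int),
                         self.cells_per_dim - 1)
        return tuple(idx)

    def is_novel(self, measure):
        return not self.occupied[self.cell_of(measure)]

    def add(self, measure):
        self.occupied[self.cell_of(measure)] = True


def shaped_rewards(imitation_rewards, measure, archive, bonus_scale=1.0):
    """Add a curiosity bonus to every step if the episode's behavior is novel.

    Assumption: the bonus is a constant per step; the paper may define it
    differently.
    """
    rewards = np.asarray(imitation_rewards, dtype=float)
    bonus = bonus_scale if archive.is_novel(measure) else 0.0
    return rewards + bonus


# Usage: the same behavior earns the bonus only before it enters the archive.
archive = BehaviorArchive(dims=2, cells_per_dim=10)
episode_rewards = [0.1, 0.2]   # e.g. per-step rewards from an IRL discriminator
episode_measure = [0.35, 0.8]  # e.g. normalized foot-contact proportions

first = shaped_rewards(episode_rewards, episode_measure, archive, bonus_scale=0.5)
archive.add(episode_measure)
second = shaped_rewards(episode_rewards, episode_measure, archive, bonus_scale=0.5)
```

In this sketch the critic is external in the sense that the bonus depends only on the archive's contents, not on the policy's own prediction error, which is one plausible reading of "extrinsic" in the abstract.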

Taxonomy

Topics: Open Education and E-Learning

Methods: Generative Adversarial Imitation Learning