D-Shape: Demonstration-Shaped Reinforcement Learning via Goal Conditioning

Caroline Wang; Garrett Warnell; Peter Stone

arXiv:2210.14428 · cs.LG · August 19, 2025 · 1 citation

TL;DR

D-Shape combines imitation learning and reinforcement learning using ideas from reward shaping and goal-conditioned RL, so that an agent can learn from suboptimal demonstrations while still being able to find the optimal policy with respect to the task reward.

Contribution

The paper combines reward shaping with goal-conditioned RL to reconcile the demonstration-matching objective of IL with the return-maximization objective of RL, a conflict that arises when the demonstrations are suboptimal.

Findings

01. Improves sample efficiency over pure RL.

02. Converges to the optimal policy even when the demonstrations are suboptimal.

03. Validated experimentally in sparse-reward gridworld domains.

Abstract

While combining imitation learning (IL) and reinforcement learning (RL) is a promising way to address poor sample efficiency in autonomous behavior acquisition, methods that do so typically assume that the requisite behavior demonstrations are provided by an expert that behaves optimally with respect to a task reward. If, however, suboptimal demonstrations are provided, a fundamental challenge appears in that the demonstration-matching objective of IL conflicts with the return-maximization objective of RL. This paper introduces D-Shape, a new method for combining IL and RL that uses ideas from reward shaping and goal-conditioned RL to resolve the above conflict. D-Shape allows learning from suboptimal demonstrations while retaining the ability to find the optimal policy with respect to the task reward. We experimentally validate D-Shape in sparse-reward gridworld domains, showing that…
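
The abstract names two ingredients, reward shaping and goal-conditioned RL, without giving their exact form. As a rough illustration only, the sketch below shows generic potential-based reward shaping with a goal-conditioned potential in a gridworld, where the goal for each step is taken from a demonstration. Everything here is an assumption made for illustration and is not taken from the paper: the names `potential`, `shaped_reward`, `goal_for_step`, and `demo_states`, the choice of negative Manhattan distance as the potential, and the rule of using the next demonstration state as the goal.

```python
from typing import List, Tuple

State = Tuple[int, int]  # (row, col) cell in a hypothetical gridworld


def potential(state: State, goal: State) -> float:
    """Assumed goal-conditioned potential: negative Manhattan distance to the goal."""
    return -float(abs(state[0] - goal[0]) + abs(state[1] - goal[1]))


def shaped_reward(task_reward: float, state: State, next_state: State,
                  goal: State, gamma: float = 0.99) -> float:
    """Task reward plus the potential-based term gamma * phi(s', g) - phi(s, g)."""
    shaping = gamma * potential(next_state, goal) - potential(state, goal)
    return task_reward + shaping


# Hypothetical demonstration: states visited by a possibly suboptimal demonstrator.
demo_states: List[State] = [(0, 0), (0, 1), (1, 1), (2, 1), (2, 2)]


def goal_for_step(t: int) -> State:
    """Assumed goal rule: the demonstration state one step ahead, clamped at the end."""
    return demo_states[min(t + 1, len(demo_states) - 1)]


if __name__ == "__main__":
    # Sparse task reward of 0; the agent follows the demonstration on step 0.
    g = goal_for_step(0)
    print(shaped_reward(0.0, (0, 0), (0, 1), g, gamma=1.0))  # moves toward the goal
    print(shaped_reward(0.0, (0, 0), (1, 0), g, gamma=1.0))  # moves away from the goal
```

For a fixed goal, a shaping term of this potential-based form is known to leave the optimal policy of the underlying task unchanged, which is one way a demonstration-derived bonus can guide exploration without overriding the task reward. How D-Shape itself defines goals and potentials should be checked against the paper.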

Taxonomy

Topics: Reinforcement Learning in Robotics