Keeping Your Distance: Solving Sparse Reward Tasks Using Self-Balancing Shaped Rewards

Alexander Trott; Stephan Zheng; Caiming Xiong; Richard Socher

arXiv:1911.01417 · cs.AI · November 5, 2019 · 21 cites

Open Access · 1 Repo

TL;DR

This paper presents a simple, model-free method that uses auxiliary distance-based rewards to improve exploration and solve sparse reward tasks effectively without additional reward engineering.

Contribution

The paper introduces an auxiliary reward, computed from pairs of rollouts, that keeps learning from settling into the local optima induced by naive distance-to-goal shaping and so improves learning in sparse reward environments.

Findings

01. Successfully solves maze navigation and 3D construction tasks.

02. Outperforms intrinsic curiosity and reward relabeling strategies.

03. Does not require domain-specific reward engineering.

Abstract

While using shaped rewards can be beneficial when solving sparse reward tasks, their successful application often requires careful engineering and is problem specific. For instance, in tasks where the agent must achieve some goal state, simple distance-to-goal reward shaping often fails, as it renders learning vulnerable to local optima. We introduce a simple and effective model-free method to learn from shaped distance-to-goal rewards on tasks where success depends on reaching a goal state. Our method introduces an auxiliary distance-based reward based on pairs of rollouts to encourage diverse exploration. This approach effectively prevents learning dynamics from stabilizing around local optima induced by the naive distance-to-goal reward shaping and enables policies to efficiently solve sparse reward tasks. Our augmented objective does not require any additional reward engineering or…
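
The idea of a distance-based reward computed from pairs of rollouts can be illustrated with a small sketch. This is not taken from the authors' repository: the function names, the Euclidean distance, the success threshold `delta`, and the exact form of the reward (clipping at zero, using the sibling's terminal state as a point to move away from) are assumptions made for illustration, and the paper's definition may differ in its details.

```python
import math


def distance(a, b):
    """Euclidean distance between two points (assumed metric)."""
    return math.dist(a, b)


def naive_shaped_reward(terminal, goal, delta=0.1):
    """Plain distance-to-goal shaping: closer to the goal is better."""
    d_goal = distance(terminal, goal)
    return 1.0 if d_goal <= delta else -d_goal


def sibling_shaped_reward(terminal, goal, sibling_terminal, delta=0.1):
    """Hypothetical paired-rollout reward.

    The rollout is still penalized by its distance to the goal, but it
    recovers some reward for ending far from where its sibling rollout
    (same start, same goal) ended. The result is clipped at zero so that
    only reaching the goal yields a positive reward.
    """
    d_goal = distance(terminal, goal)
    if d_goal <= delta:
        return 1.0
    return min(0.0, -d_goal + distance(terminal, sibling_terminal))


# Two sibling rollouts that both stall short of the goal.
goal = (10.0, 0.0)
rollout_a = (4.0, 0.0)
rollout_b = (4.0, 3.0)

reward_a = sibling_shaped_reward(rollout_a, goal, rollout_b)  # -3.0
reward_b = sibling_shaped_reward(rollout_b, goal, rollout_a)
```

In this sketch, two rollouts that end at exactly the same place receive the naive reward, while rollouts that end apart from each other are penalized less. That is one way a pairwise term can discourage a policy from repeatedly collapsing onto the same local optimum.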

Code & Models

Repositories

salesforce/sibling-rivalry
pytorch

Taxonomy

Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Robotics and Sensor-Based Localization