Learning Intrinsic Symbolic Rewards in Reinforcement Learning

Hassam Sheikh; Shauharda Khadka; Santiago Miret; Somdeb Majumdar

arXiv:2010.03694 · cs.LG · October 12, 2020 · 1 citation

Open Access

TL;DR

This paper introduces a method that discovers dense rewards for reinforcement learning in the form of low-dimensional symbolic trees. The resulting rewards are easier to analyze than those learned by neural-network-based reward discovery, and the paper reports that they are also more effective.

Contribution

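The paper presents a method that discovers dense rewards as symbolic trees, which use simple functional operators to map an agent's observations to a scalar reward. The paper reports that these rewards are more interpretable than neural-network-based reward discovery and outperform it in the RL environments tested.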

Findings

01. Symbolic rewards improve policy learning in sparse-reward settings.

02. The method outperforms neural-network-based reward discovery algorithms.

03. The method is effective in both continuous and discrete action spaces.

Abstract

Learning effective policies for sparse objectives is a key challenge in Deep Reinforcement Learning (RL). A common approach is to design task-related dense rewards to improve task learnability. While such rewards are easily interpreted, they rely on heuristics and domain expertise. Alternate approaches that train neural networks to discover dense surrogate rewards avoid heuristics, but are high-dimensional, black-box solutions offering little interpretability. In this paper, we present a method that discovers dense rewards in the form of low-dimensional symbolic trees - thus making them more tractable for analysis. The trees use simple functional operators to map an agent's observations to a scalar reward, which then supervises the policy gradient learning of a neural network policy. We test our method on continuous action spaces in Mujoco and discrete action spaces in Atari and Pygame…
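The abstract describes a symbolic tree that maps an agent's observations to a scalar reward using simple functional operators. The snippet below is a minimal sketch of that idea only, not the authors' implementation: the `Node` class, the operator set in `OPS`, and the example tree are all hypothetical choices made for illustration, and the search procedure that discovers such trees is not shown.

```python
import math
import operator
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Node:
    """One node of a symbolic reward tree (hypothetical structure)."""
    op: str                            # operator name, "obs", or "const"
    children: Tuple["Node", ...] = ()  # sub-trees; empty for leaves
    value: float = 0.0                 # observation index or constant


# Assumed operator set; the paper's actual operators may differ.
OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "max": max,
    "cos": math.cos,
    "neg": operator.neg,
}


def evaluate(node: Node, obs: Sequence[float]) -> float:
    """Recursively evaluate the tree on one observation vector."""
    if node.op == "obs":
        return float(obs[int(node.value)])
    if node.op == "const":
        return float(node.value)
    args = [evaluate(child, obs) for child in node.children]
    return float(OPS[node.op](*args))


def obs_leaf(index: int) -> Node:
    return Node("obs", value=float(index))


def const_leaf(c: float) -> Node:
    return Node("const", value=c)


# Illustrative tree: reward = cos(obs[0]) - 0.1 * obs[1]^2
reward_tree = Node("sub", (
    Node("cos", (obs_leaf(0),)),
    Node("mul", (const_leaf(0.1), Node("mul", (obs_leaf(1), obs_leaf(1))))),
))

# The scalar returned here would play the role of the dense reward
# that supervises the policy update at this step.
dense_reward = evaluate(reward_tree, [0.0, 2.0])
print(dense_reward)
```

Because the whole reward is a handful of named operators over observation indices, it can be read and analyzed directly, which is the interpretability argument the abstract makes against neural-network reward models.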

Taxonomy

Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications