Adapting to Reward Progressivity via Spectral Reinforcement Learning

Michael Dann; John Thangarajah

arXiv:2104.14138 · cs.LG · April 30, 2021 · 1 citation

TL;DR

This paper introduces Spectral DQN, a value-based reinforcement learning method that decomposes rewards into frequencies so that it can better handle tasks where reward magnitudes tend to increase over time. In two domains with extreme reward progressivity, it makes much further progress than standard value-based methods.

Contribution

Spectral DQN decomposes the reward into frequency components such that the high frequencies only activate when large rewards are found. This allows the training loss to be balanced so that small-reward and large-reward regions are weighted more evenly.

Findings

01 Makes much further progress than standard value-based methods in two domains with extreme reward progressivity

02 Remains competitive on a set of six standard Atari games

03 Shows potential advantages beyond its initial focus

Abstract

In this paper we consider reinforcement learning tasks with progressive rewards; that is, tasks where the rewards tend to increase in magnitude over time. We hypothesise that this property may be problematic for value-based deep reinforcement learning agents, particularly if the agent must first succeed in relatively unrewarding regions of the task in order to reach more rewarding regions. To address this issue, we propose Spectral DQN, which decomposes the reward into frequencies such that the high frequencies only activate when large rewards are found. This allows the training loss to be balanced so that it gives more even weighting across small and large reward regions. In two domains with extreme reward progressivity, where standard value-based methods struggle significantly, Spectral DQN is able to make much farther progress. Moreover, when evaluated on a set of six standard Atari…
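The decomposition described in the abstract can be illustrated with a minimal sketch. This is not taken from the paper or the official repository: the function names `decompose_reward` and `recompose_reward`, the base of 2, the number of frequencies, and the exact thresholding rule are all assumptions chosen for illustration. The sketch assumes non-negative rewards and splits each reward into components in [0, 1], where component i is weighted by base**i and stays at zero until the reward exceeds the total capacity of the lower components.

```python
def decompose_reward(reward, base=2.0, num_freqs=8):
    """Split a non-negative reward into per-frequency components in [0, 1].

    Hypothetical scheme for illustration only: component i carries weight
    base**i and only becomes non-zero once the reward exceeds the combined
    capacity of components 0..i-1.
    """
    components = []
    threshold = 0.0  # reward level at which frequency i starts to activate
    for i in range(num_freqs):
        scale = base ** i
        level = (reward - threshold) / scale
        components.append(min(1.0, max(0.0, level)))  # clip to [0, 1]
        threshold += scale
    return components


def recompose_reward(components, base=2.0):
    """Invert decompose_reward by summing the weighted components."""
    return sum(c * base ** i for i, c in enumerate(components))


small = decompose_reward(0.5)  # only the lowest frequency is active
large = decompose_reward(5.0)  # higher frequencies switch on
print(small[:4])  # [0.5, 0.0, 0.0, 0.0]
print(large[:4])  # [1.0, 1.0, 0.5, 0.0]
print(recompose_reward(large))  # 5.0
```

Because every component lies in [0, 1] regardless of the raw reward size, a loss computed per component would weight small and large reward regions on a comparable scale, which is the balancing effect the abstract describes. With these assumed defaults the largest representable reward is 255; larger rewards would be truncated.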

Code & Models

Repositories

mchldann/SpectralDQN (official)

Videos

Adapting to Reward Progressivity via Spectral Reinforcement Learning · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Neural and Behavioral Psychology Studies

Methods: Convolution · Dense Connections · Q-Learning · Deep Q-Network