Adapting to Reward Progressivity via Spectral Reinforcement Learning
Michael Dann, John Thangarajah

TL;DR
This paper introduces Spectral DQN, a value-based reinforcement learning method that decomposes rewards into frequencies so that agents can better handle tasks whose rewards grow in magnitude over time. It makes much farther progress than standard value-based methods in two domains with extreme reward progressivity.
Contribution
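Spectral DQN decomposes the reward into frequency components, with the high frequencies activating only when large rewards are found. This lets the training loss weight small-reward and large-reward regions more evenly, which helps learning in reward-progressive tasks.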
Findings
Outperforms standard methods in reward-progressive domains
Remains competitive on standard Atari games
Shows potential advantages beyond its initial focus
Abstract
In this paper we consider reinforcement learning tasks with progressive rewards; that is, tasks where the rewards tend to increase in magnitude over time. We hypothesise that this property may be problematic for value-based deep reinforcement learning agents, particularly if the agent must first succeed in relatively unrewarding regions of the task in order to reach more rewarding regions. To address this issue, we propose Spectral DQN, which decomposes the reward into frequencies such that the high frequencies only activate when large rewards are found. This allows the training loss to be balanced so that it gives more even weighting across small and large reward regions. In two domains with extreme reward progressivity, where standard value-based methods struggle significantly, Spectral DQN is able to make much farther progress. Moreover, when evaluated on a set of six standard Atari…
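To make the idea of "frequencies that only activate for large rewards" concrete, here is a minimal sketch of one possible decomposition. It is an illustration under stated assumptions, not the authors' implementation: the function names `spectral_decompose` and `reconstruct`, the geometric `base`, and the number of components `n_freq` are all hypothetical, and the sketch handles only non-negative rewards.

```python
from typing import List


def spectral_decompose(reward: float, base: float = 2.0, n_freq: int = 8) -> List[float]:
    """Split a non-negative reward into n_freq components, each in [0, 1].

    Assumption (illustrative only): component i covers a band of width
    base**i, and bands are stacked end to end, so component i stays at 0
    until the reward exceeds the total width of all lower bands.
    """
    components = []
    lower = 0.0  # start of the current band
    for i in range(n_freq):
        width = base ** i
        # Fraction of this band that the reward fills, clipped to [0, 1].
        filled = min(max((reward - lower) / width, 0.0), 1.0)
        components.append(filled)
        lower += width
    return components


def reconstruct(components: List[float], base: float = 2.0) -> float:
    """Invert the decomposition by weighting component i with base**i."""
    return sum(c * base ** i for i, c in enumerate(components))


small = spectral_decompose(0.5, base=2.0, n_freq=4)   # only the lowest band is active
large = spectral_decompose(5.0, base=2.0, n_freq=4)   # higher bands switch on
print(small, large, reconstruct(large, base=2.0))
```

Because every component lies in [0, 1], a per-component loss would treat a small early reward and a large late reward on a comparable scale, which is the balancing effect the abstract describes. In this sketch, rewards above the total band width (`sum(base**i)` over all components) are clipped, and negative rewards map to all zeros.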
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Neural and Behavioral Psychology Studies
Methods: Convolution · Dense Connections · Q-Learning · Deep Q-Network
