Hybrid Reward Architecture for Reinforcement Learning
Harm van Seijen, Mehdi Fatemi, Joshua Romoff, Romain Laroche, Tavian Barnes, and Jeffrey Tsang

TL;DR
This paper introduces Hybrid Reward Architecture (HRA), a reinforcement learning method that decomposes reward functions to improve learning efficiency and stability in complex domains, demonstrated on Atari Ms. Pac-Man.
Contribution
HRA is a novel approach that learns separate value functions for decomposed rewards, facilitating better generalisation in challenging RL environments.
Findings
HRA outperforms traditional methods on a toy problem.
HRA achieves above-human performance on Ms. Pac-Man.
Decomposing rewards improves learning stability.
Abstract
One of the main challenges in reinforcement learning (RL) is generalisation. In typical deep RL methods this is achieved by approximating the optimal value function with a low-dimensional representation using a deep network. While this approach works well in many domains, in domains where the optimal value function cannot easily be reduced to a low-dimensional representation, learning can be very slow and unstable. This paper contributes towards tackling such challenging domains, by proposing a new method, called Hybrid Reward Architecture (HRA). HRA takes as input a decomposed reward function and learns a separate value function for each component reward function. Because each component typically only depends on a subset of all features, the corresponding value function can be approximated more easily by a low-dimensional representation, enabling more effective learning. We demonstrate…
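The decomposition described in the abstract can be illustrated with a minimal tabular sketch. This is not the authors' implementation. It assumes that each reward component gets its own Q-table (a "head"), that each head is trained with a standard Q-learning target on its own component reward, and that actions are chosen greedily on the sum of the heads' values. The class name `TabularHRA`, its parameters, and the tabular setting are hypothetical and chosen only for illustration.

```python
import numpy as np


class TabularHRA:
    """Illustrative sketch: one Q-table per reward component (assumed design)."""

    def __init__(self, n_states, n_actions, n_heads, alpha=0.1, gamma=0.9):
        # q[k, s, a] is head k's value estimate for action a in state s.
        self.q = np.zeros((n_heads, n_states, n_actions))
        self.alpha = alpha  # learning rate (assumed hyperparameter)
        self.gamma = gamma  # discount factor (assumed hyperparameter)

    def aggregate(self, state):
        # Aggregated action values: sum of all heads' values for this state.
        return self.q[:, state, :].sum(axis=0)

    def act(self, state):
        # Greedy action with respect to the aggregated values.
        return int(np.argmax(self.aggregate(state)))

    def update(self, state, action, rewards, next_state, done):
        # rewards holds one entry per head: the decomposed reward signal.
        rewards = np.asarray(rewards, dtype=float)
        if done:
            bootstrap = np.zeros_like(rewards)
        else:
            # Each head bootstraps from its own maximum, independently.
            bootstrap = self.gamma * self.q[:, next_state, :].max(axis=1)
        target = rewards + bootstrap
        td_error = target - self.q[:, state, action]
        self.q[:, state, action] += self.alpha * td_error
```

A short usage example with two reward components, where the environment transitions are made up for illustration:

```python
agent = TabularHRA(n_states=2, n_actions=2, n_heads=2, alpha=0.5, gamma=0.9)
agent.update(state=0, action=1, rewards=[1.0, 0.0], next_state=1, done=True)
print(agent.aggregate(0))  # summed values of both heads in state 0
print(agent.act(0))        # greedy action on the summed values
```

Because each head sees only its own component reward, its table (or, in a deep RL setting, its network) only has to represent the part of the value that depends on that component.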
Taxonomy
Topics: Reinforcement Learning in Robotics · Receptor Mechanisms and Signaling · Evolutionary Algorithms and Applications
