Ray Interference: a Source of Plateaus in Deep Reinforcement Learning
Tom Schaul, Diana Borsa, Joseph Modayil, Razvan Pascanu

TL;DR
This paper investigates ray interference in deep reinforcement learning: a coupling between learning and data generation that, under function approximation, produces performance plateaus. It characterizes the conditions under which the effect occurs, describes its properties, and discusses possible remedies.
Contribution
Rather than proposing a new method, the paper identifies and analyzes ray interference as a source of learning plateaus in RL, establishing the conditions under which it occurs and deriving the exact learning dynamics in a restricted setting.
Findings
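Ray interference makes learning traverse a sequence of performance plateaus, so the agent learns one thing at a time even when learning in parallel would be better.
The conditions under which ray interference occurs are established, along with its relation to saddle points.
A number of its properties are characterized, and possible remedies are discussed.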
Abstract
Rather than proposing a new method, this paper investigates an issue present in existing learning algorithms. We study the learning dynamics of reinforcement learning (RL), specifically a characteristic coupling between learning and data generation that arises because RL agents control their future data distribution. In the presence of function approximation, this coupling can lead to a problematic type of 'ray interference', characterized by learning dynamics that sequentially traverse a number of performance plateaus, effectively constraining the agent to learn one thing at a time even when learning in parallel is better. We establish the conditions under which ray interference occurs, show its relation to saddle points and obtain the exact learning dynamics in a restricted setting. We characterize a number of its properties and discuss possible remedies.
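The coupling between learning and data generation described in the abstract can be illustrated with a minimal sketch. This is not the paper's experimental setup: the two-armed bandit, the softmax policy, the learning rate, and the function names below are all assumptions chosen for illustration. With a softmax policy and exact policy-gradient ascent, the gradient of the expected return is proportional to p(1 - p), where p is the probability of the rewarding arm, so a policy that rarely picks that arm also learns slowly about it and the return curve starts on a plateau. The sketch shows only this single-component coupling, not interference between multiple components.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_return_curve(init_logit, steps=1000, lr=1.0):
    """Exact (expected) policy-gradient ascent on a hypothetical 2-armed bandit.

    Arm 0 pays reward 1 and arm 1 pays 0, so the expected return J equals
    the probability p of choosing arm 0. Returns J at every step.
    """
    logits = [init_logit, 0.0]
    curve = []
    for _ in range(steps):
        p = softmax(logits)[0]
        curve.append(p)
        # For a softmax policy: dJ/dlogit_0 = p(1 - p), dJ/dlogit_1 = -p(1 - p).
        # The step size therefore shrinks with p: rare actions learn slowly.
        g = p * (1.0 - p)
        logits[0] += lr * g
        logits[1] -= lr * g
    return curve


def steps_to_reach(curve, threshold=0.5):
    """First step index at which the return reaches the threshold, else None."""
    for t, j in enumerate(curve):
        if j >= threshold:
            return t
    return None


if __name__ == "__main__":
    for init in (-2.0, -6.0):
        curve = expected_return_curve(init)
        print(f"init_logit={init}: J0={curve[0]:.4f}, "
              f"steps to J>=0.5: {steps_to_reach(curve)}")
```

Running the script compares a moderately unlucky initialization with a very unlucky one; the second spends far longer near zero return before improving, which is the plateau behavior in its simplest form.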
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Gene Regulatory Network Analysis
