TL;DR
This paper introduces IQ, a novel deep Q-learning method that uses online clustering and knowledge distillation to reduce catastrophic interference, improving stability and performance in reinforcement learning tasks.
Contribution
The paper proposes a new approach combining context division and knowledge distillation to mitigate catastrophic interference in deep reinforcement learning.
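As a rough illustration of what on-the-fly context division could look like, the sketch below uses sequential (online) k-means to assign each incoming state to a context and update that context's centroid incrementally. The class name `OnlineKMeans`, the Euclidean distance, and the running-mean update are assumptions made for illustration; the summary does not specify which online clustering algorithm the paper uses.

```python
import numpy as np


class OnlineKMeans:
    """Hypothetical sequential k-means used to split states into contexts."""

    def __init__(self, n_contexts, dim, seed=0):
        rng = np.random.default_rng(seed)
        # Random initial centroids; a real system might seed them from early states.
        self.centroids = rng.normal(size=(n_contexts, dim))
        self.counts = np.zeros(n_contexts, dtype=int)

    def assign(self, state):
        # Context index = nearest centroid in Euclidean distance.
        dists = np.linalg.norm(self.centroids - state, axis=1)
        return int(np.argmin(dists))

    def update(self, state):
        # Assign the state, then move the winning centroid toward it
        # using a running mean over the states assigned so far.
        k = self.assign(state)
        self.counts[k] += 1
        self.centroids[k] += (state - self.centroids[k]) / self.counts[k]
        return k
```

Each call to `update` returns the context index for the current state, which could then select the output head to train.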
Findings
IQ outperforms existing methods on control and Atari benchmarks.
The approach enhances stability and learning efficiency.
Extensive experiments validate the effectiveness of IQ.
Abstract
The powerful learning ability of deep neural networks enables reinforcement learning agents to learn competent control policies directly from continuous environments. In theory, to achieve stable performance, neural networks assume i.i.d. inputs, which unfortunately does not hold in the general reinforcement learning paradigm, where the training data is temporally correlated and non-stationary. This issue may lead to the phenomenon of "catastrophic interference" and a collapse in performance. In this paper, we present IQ, i.e., interference-aware deep Q-learning, to mitigate catastrophic interference in single-task deep reinforcement learning. Specifically, we resort to online clustering to achieve on-the-fly context division, together with a multi-head network and a knowledge distillation regularization term for preserving the policy of learned contexts. Built upon deep Q networks, IQ…
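One way to picture the multi-head network with a distillation regularizer is a loss that trains only the head of the current context on the TD error while penalizing drift in the other heads relative to a frozen snapshot. The sketch below is a minimal version under stated assumptions: the function name `interference_aware_loss`, the squared-error form of both terms, and the `distill_weight` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np


def interference_aware_loss(q_current, q_snapshot, active_head, action,
                            td_target, distill_weight=1.0):
    """Hypothetical per-state loss for a multi-head Q-network.

    q_current, q_snapshot: arrays of shape (n_heads, n_actions) holding the
    Q-values of the trainable network and of a frozen earlier copy.
    """
    # TD term: only the head of the currently active context is fitted.
    td_error = td_target - q_current[active_head, action]
    td_loss = 0.5 * td_error ** 2

    # Distillation term: heads of the other contexts should stay close
    # to what the frozen snapshot predicts (assumed squared-error form).
    other = np.arange(q_current.shape[0]) != active_head
    if other.any():
        distill_loss = np.mean((q_current[other] - q_snapshot[other]) ** 2)
    else:
        distill_loss = 0.0

    return td_loss + distill_weight * distill_loss
```

Setting `distill_weight` to zero reduces this to an ordinary single-head TD loss, which makes the role of the regularizer easy to isolate in an ablation.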
Taxonomy
Methods: Knowledge Distillation
