Catastrophic Interference in Reinforcement Learning: A Solution Based on Context Division and Knowledge Distillation

Tiantian Zhang; Xueqian Wang; Bin Liang; Bo Yuan

arXiv:2109.00525 · cs.LG · September 2, 2022

TL;DR

This paper introduces IQ, a novel deep Q-learning method that uses online clustering and knowledge distillation to reduce catastrophic interference, improving stability and performance in reinforcement learning tasks.

Contribution

The paper proposes a new approach combining context division and knowledge distillation to mitigate catastrophic interference in deep reinforcement learning.

Findings

1. IQ outperforms existing methods on control and Atari benchmarks.
2. The approach enhances stability and learning efficiency.
3. Extensive experiments validate the effectiveness of IQ.

Abstract

The powerful learning ability of deep neural networks enables reinforcement learning agents to learn competent control policies directly from continuous environments. In theory, to achieve stable performance, neural networks assume i.i.d. inputs, which unfortunately does not hold in the general reinforcement learning paradigm, where the training data is temporally correlated and non-stationary. This issue may lead to the phenomenon of "catastrophic interference" and a collapse in performance. In this paper, we present IQ, i.e., interference-aware deep Q-learning, to mitigate catastrophic interference in single-task deep reinforcement learning. Specifically, we resort to online clustering to achieve on-the-fly context division, together with a multi-head network and a knowledge distillation regularization term for preserving the policy of learned contexts. Built upon deep Q networks, IQ…
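To make the two ingredients named in the abstract more concrete, here is a minimal, dependency-free sketch. It is not the authors' implementation: the abstract does not say which clustering algorithm or distillation form IQ uses, so sequential k-means and a squared-error penalty are assumptions chosen for brevity. The names `OnlineKMeans`, `distillation_penalty`, `context_loss`, and the weight `lam` are all hypothetical.

```python
class OnlineKMeans:
    """Sequential k-means (an assumed stand-in for IQ's online clustering).

    Each incoming state is assigned to its nearest centroid, and that
    centroid is nudged toward the state with a 1/count step size.
    """

    def __init__(self, k):
        self.k = k
        self.centroids = []  # one list of floats per context
        self.counts = []     # number of states seen per context

    def assign(self, state):
        # Seed the centroids with the first k states that arrive.
        if len(self.centroids) < self.k:
            self.centroids.append(list(state))
            self.counts.append(1)
            return len(self.centroids) - 1
        # Pick the nearest centroid by squared Euclidean distance.
        dists = [sum((s - c) ** 2 for s, c in zip(state, cen))
                 for cen in self.centroids]
        j = dists.index(min(dists))
        # Running-mean update of the winning centroid.
        self.counts[j] += 1
        step = 1.0 / self.counts[j]
        self.centroids[j] = [c + step * (s - c)
                             for s, c in zip(state, self.centroids[j])]
        return j


def distillation_penalty(q_heads_new, q_heads_old, active_head):
    """Mean squared gap between current and frozen Q-values on every
    head except the one being trained (assumed form of the regularizer)."""
    total, n = 0.0, 0
    for h, (new, old) in enumerate(zip(q_heads_new, q_heads_old)):
        if h == active_head:
            continue
        for q_new, q_old in zip(new, old):
            total += (q_new - q_old) ** 2
            n += 1
    return total / n if n else 0.0


def context_loss(td_error, q_heads_new, q_heads_old, active_head, lam=1.0):
    """Squared TD error on the active head plus the weighted penalty
    that keeps the other heads close to their frozen outputs."""
    penalty = distillation_penalty(q_heads_new, q_heads_old, active_head)
    return td_error ** 2 + lam * penalty
```

In this sketch, `assign` would be called on each observed state to pick the output head to train, while `q_heads_old` would come from a frozen copy of the network taken before the update. How IQ actually schedules these steps is not stated in the excerpt above.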

Code & Models

Repositories

sweety-dm/interference-aware-deep-q-learning (tf, Official)

Taxonomy

Methods: Knowledge Distillation