A Dirichlet Process Mixture of Robust Task Models for Scalable Lifelong Reinforcement Learning
Zhi Wang, Chunlin Chen, Daoyi Dong

TL;DR
This paper introduces a scalable lifelong reinforcement learning approach using a Dirichlet process mixture model that dynamically expands and adapts to new tasks, preventing forgetting and improving generalization.
Contribution
The paper proposes a non-parametric Bayesian framework for lifelong RL that automatically adjusts model complexity and clusters tasks without explicit boundaries or heuristics.
Findings
Outperforms existing methods in robot navigation and locomotion tasks.
Effectively prevents catastrophic forgetting in lifelong learning.
Demonstrates scalable adaptation to non-stationary task distributions.
Abstract
While reinforcement learning (RL) algorithms are achieving state-of-the-art performance in various challenging tasks, they can easily encounter catastrophic forgetting or interference when faced with lifelong streaming information. In this paper, we propose a scalable lifelong RL method that dynamically expands the network capacity to accommodate new knowledge while preventing past memories from being perturbed. We use a Dirichlet process mixture to model the non-stationary task distribution, which captures task relatedness by estimating the likelihood of task-to-cluster assignments and clusters the task models in a latent space. We formulate the prior distribution of the mixture as a Chinese restaurant process (CRP) that instantiates new mixture components as needed. The update and expansion of the mixture are governed by the Bayesian non-parametric framework with an expectation…
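To illustrate how a CRP prior can instantiate new mixture components as needed, here is a minimal sketch of the standard CRP assignment rule: a new task joins existing cluster k with probability proportional to the number of tasks already in it, and opens a new cluster with probability proportional to a concentration parameter. The names `crp_prior`, `sample_assignments`, and `alpha` are hypothetical and chosen for illustration. This covers only the prior; the likelihood term and the paper's update procedure are not reproduced here.

```python
import random


def crp_prior(counts, alpha):
    """Standard CRP prior over cluster assignments for the next task.

    counts: number of tasks currently assigned to each existing cluster.
    alpha:  concentration parameter (> 0); an illustrative assumption,
            not a value taken from the paper.
    Returns a list of len(counts) + 1 probabilities; the last entry is
    the probability of opening a new cluster.
    """
    denom = sum(counts) + alpha
    probs = [c / denom for c in counts]
    probs.append(alpha / denom)
    return probs


def sample_assignments(num_tasks, alpha, seed=0):
    """Draw task-to-cluster assignments sequentially from the CRP prior."""
    rng = random.Random(seed)
    counts = []        # tasks per cluster; grows when a new cluster opens
    assignments = []   # cluster index chosen for each task, in order
    for _ in range(num_tasks):
        probs = crp_prior(counts, alpha)
        k = rng.choices(range(len(probs)), weights=probs)[0]
        if k == len(counts):
            counts.append(1)   # instantiate a new mixture component
        else:
            counts[k] += 1     # reuse an existing component
        assignments.append(k)
    return assignments, counts
```

With two existing clusters holding 2 and 1 tasks and `alpha = 1.0`, `crp_prior([2, 1], 1.0)` gives probabilities 0.5 and 0.25 for the existing clusters and 0.25 for a new one. A larger `alpha` makes new clusters more likely, which is how the number of components can grow with the task stream without being fixed in advance.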
