TL;DR
This paper introduces LLIRL, an incremental reinforcement learning algorithm that uses online Bayesian inference and a nonparametric mixture model to adapt continuously to changing environments, improving lifelong learning performance.
Contribution
The paper proposes a lifelong RL method that combines a Chinese restaurant process (CRP) mixture model with online EM to adapt to dynamic environments, without requiring any external signal that the environment has changed.
Findings
LLIRL outperforms existing methods in dynamic environments.
The approach adapts to environmental changes and reuses previously learned environment models from its library.
Experiments validate the method's efficiency in lifelong learning scenarios.
Abstract
A central capability of a long-lived reinforcement learning (RL) agent is to incrementally adapt its behavior as its environment changes, and to incrementally build upon previous experiences to facilitate future learning in real-world scenarios. In this paper, we propose LifeLong Incremental Reinforcement Learning (LLIRL), a new incremental algorithm for efficient lifelong adaptation to dynamic environments. We develop and maintain a library that contains an infinite mixture of parameterized environment models, which is equivalent to clustering environment parameters in a latent space. The prior distribution over the mixture is formulated as a Chinese restaurant process (CRP), which incrementally instantiates new environment models without any external information to signal environmental changes in advance. During lifelong learning, we employ the expectation maximization (EM) algorithm…
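The CRP prior mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `crp_prior` and `crp_posterior`, the concentration parameter `alpha`, and the per-cluster `counts` and `log_likelihoods` inputs are all assumptions chosen for illustration. The sketch only shows the standard CRP rule (an existing cluster is chosen in proportion to its count, a new one in proportion to `alpha`) and how such a prior could be combined with the likelihood of new data under each environment model.

```python
import numpy as np

def crp_prior(counts, alpha):
    """Standard CRP prior over K existing clusters plus one new cluster.

    counts: how many past assignments each existing cluster has received
            (hypothetical input; must be positive).
    alpha:  concentration parameter; larger values favor new clusters.
    Returns an array of length K + 1 whose last entry is the new cluster.
    """
    counts = np.asarray(counts, dtype=float)
    total = counts.sum() + alpha
    return np.append(counts, alpha) / total

def crp_posterior(counts, alpha, log_likelihoods):
    """Posterior over cluster assignments: prior times likelihood, normalized.

    log_likelihoods: log-likelihood of the newly observed data under each
                     existing environment model, with the last entry for a
                     freshly initialized model (assumed to be supplied by
                     the caller).
    """
    prior = crp_prior(counts, alpha)
    log_post = np.log(prior) + np.asarray(log_likelihoods, dtype=float)
    log_post -= log_post.max()  # subtract the max for numerical stability
    post = np.exp(log_post)
    return post / post.sum()

# Two existing environment models with 3 and 1 past assignments.
prior = crp_prior([3, 1], alpha=1.0)
# New data that the existing models explain poorly favors a new cluster.
posterior = crp_posterior([3, 1], 1.0, [-10.0, -10.0, 0.0])
```

In this sketch, a new environment model would be instantiated when the last entry of the posterior dominates, which needs no external signal that the environment has changed. How the likelihoods are computed and how the models are updated afterwards is left to the paper.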
