Self-Activating Neural Ensembles for Continual Reinforcement Learning
Sam Powers, Eliot Xing, Abhinav Gupta

TL;DR
Self-Activating Neural Ensembles (SANE) is a task-agnostic, modular approach for continual reinforcement learning that prevents catastrophic forgetting by activating and updating only relevant modules, enabling lifelong skill acquisition.
Contribution
Introduces SANE, a modular, task-agnostic framework that dynamically creates and activates modules to retain old skills while learning new ones without task boundaries.
Findings
Effective in visually rich procedurally generated environments
Prevents catastrophic forgetting during continual learning
Dynamically creates modules as needed
Abstract
The ability for an agent to continuously learn new skills without catastrophically forgetting existing knowledge is of critical importance for the development of generally intelligent agents. Most methods devised to address this problem depend heavily on well-defined task boundaries, and thus depend on human supervision. Our task-agnostic method, Self-Activating Neural Ensembles (SANE), uses a modular architecture designed to avoid catastrophic forgetting without making any such assumptions. At the beginning of each trajectory, a module in the SANE ensemble is activated to determine the agent's next policy. During training, new modules are created as needed and only activated modules are updated to ensure that unused modules remain unchanged. This system enables our method to retain and leverage old skills, while growing and learning new ones. We demonstrate our approach on visually…
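The loop the abstract describes (activate one module at the start of a trajectory, create modules as needed, update only the activated one) can be illustrated with a toy sketch. This is not the authors' implementation. The abstract does not say how a module is chosen or when a new one is created, so the scalar `key`, the distance threshold `radius`, the learning rate `lr`, and the names `Module`, `ToyEnsemble`, and `run_demo` are all assumptions made up for illustration, and a single float stands in for the policy weights.

```python
from dataclasses import dataclass


@dataclass
class Module:
    # Hypothetical: a scalar summary of the observations this module handles.
    key: float
    # Stand-in for the module's policy weights.
    policy_param: float = 0.0
    # Number of times this module has been updated.
    updates: int = 0


class ToyEnsemble:
    def __init__(self, radius=1.0, lr=0.5):
        self.modules = []
        self.radius = radius  # assumed creation threshold, not from the paper
        self.lr = lr          # assumed step size, not from the paper

    def activate(self, first_obs):
        """Pick the module whose key is nearest to the first observation.

        If no module is within `radius`, create a new one (assumed rule).
        """
        if self.modules:
            best = min(self.modules, key=lambda m: abs(m.key - first_obs))
            if abs(best.key - first_obs) <= self.radius:
                return best
        new_module = Module(key=first_obs)
        self.modules.append(new_module)
        return new_module

    def update(self, active, target):
        """Move only the activated module toward the target; others stay unchanged."""
        active.policy_param += self.lr * (target - active.policy_param)
        active.updates += 1


def run_demo():
    ensemble = ToyEnsemble(radius=1.0, lr=0.5)
    # Two "tasks" arrive back to back with no task-boundary signal.
    # Each item is (first observation of the trajectory, target policy parameter).
    stream = [(0.0, 1.0)] * 4 + [(10.0, -1.0)] * 4
    for first_obs, target in stream:
        module = ensemble.activate(first_obs)
        ensemble.update(module, target)
    return ensemble


if __name__ == "__main__":
    for m in run_demo().modules:
        print(m)
```

In this toy run the second task spawns its own module, so the first module's parameter is left exactly where the first task put it. That is the forgetting-avoidance property the abstract attributes to updating only activated modules.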
Taxonomy
Topics: Reinforcement Learning in Robotics · Neural Networks and Applications · Explainable Artificial Intelligence (XAI)
