Dynamic Mixture of Experts Against Severe Distribution Shifts
Donghu Kim

TL;DR
This paper evaluates a Dynamic Mixture of Experts approach for continual and reinforcement learning, aiming to improve adaptability and efficiency in handling evolving data distributions.
Contribution
It introduces and benchmarks a DynamicMoE method that dynamically expands capacity, addressing limitations of previous approaches in continual learning.
Findings
DynamicMoE improves adaptation to distribution shifts.
The method reduces catastrophic forgetting.
It demonstrates parameter efficiency compared to existing methods.
Abstract
The challenge of building neural networks that can continuously learn and adapt to evolving data streams is central to the fields of continual learning (CL) and reinforcement learning (RL). This lifelong learning problem is often framed in terms of the plasticity-stability dilemma, focusing on issues like loss of plasticity and catastrophic forgetting. Unlike artificial neural networks, biological brains maintain plasticity through capacity growth, which has inspired researchers to explore similar approaches in artificial networks, such as adding capacity dynamically. Prior solutions often lack parameter efficiency or depend on explicit task indices, but Mixture-of-Experts (MoE) architectures offer a promising alternative by specializing experts for distinct distributions. This paper aims to evaluate a DynamicMoE approach for continual and reinforcement learning environments and benchmark its effectiveness…
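To make the idea of adding capacity dynamically more concrete, here is a minimal sketch of a mixture of linear experts whose expert count can grow. It is an illustration under stated assumptions, not the paper's method: the class name `DynamicMoESketch`, the top-1 routing rule, the random initialization, and the `add_expert` helper are all hypothetical, and the sketch leaves open when a new expert should be added.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class DynamicMoESketch:
    """Toy mixture of linear experts whose expert count can grow.

    Hypothetical illustration: the names, the top-1 routing rule, and the
    initialization are assumptions, not taken from the paper.
    """

    def __init__(self, in_dim, num_experts=1, seed=0):
        self.in_dim = in_dim
        self.rng = random.Random(seed)
        self.experts = []  # one weight vector per expert (scalar output)
        self.router = []   # one gating vector per expert
        for _ in range(num_experts):
            self.add_expert()

    def _random_vector(self):
        return [self.rng.gauss(0.0, 0.1) for _ in range(self.in_dim)]

    def add_expert(self):
        """Grow capacity: append one expert and its matching router row.

        Existing experts and router rows are left untouched.
        """
        self.experts.append(self._random_vector())
        self.router.append(self._random_vector())

    def gate(self, x):
        """Return routing probabilities over the current experts."""
        logits = [sum(w * xi for w, xi in zip(row, x)) for row in self.router]
        return softmax(logits)

    def forward(self, x):
        """Route x to the single most probable expert (top-1 routing)."""
        probs = self.gate(x)
        k = max(range(len(probs)), key=probs.__getitem__)
        y = sum(w * xi for w, xi in zip(self.experts[k], x))
        return y, k

    def num_parameters(self):
        """Expert weights plus router weights."""
        return 2 * self.in_dim * len(self.experts)
```

In this sketch, calling `add_expert()` when a distribution shift is suspected increases the parameter count by `2 * in_dim` and leaves earlier experts unchanged, which is one simple way to picture how growth could limit interference with previously learned behavior. No explicit task index is passed to `forward`; the router alone selects the expert.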
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Reinforcement Learning in Robotics · Neural Networks and Applications
