Sharing Lifelong Reinforcement Learning Knowledge via Modulating Masks
Saptarshi Nath, Christos Peridis, Eseoghene Ben-Iwhiwhu, Xinran Liu, Shirin Dora, Cong Liu, Soheil Kolouri, Andrea Soltoggio

TL;DR
This paper proposes a distributed lifelong reinforcement learning framework where agents share task-specific knowledge via modulating masks, enabling robust, efficient, and scalable multi-agent learning with on-demand knowledge exchange.
Contribution
It introduces a novel distributed lifelong learning approach using modulating masks for knowledge sharing among multiple agents in asynchronous environments.
Findings
On-demand mask communication improves learning efficiency.
The system is robust to connection drops.
The approach achieves better performance than baseline RL methods.
Abstract
Lifelong learning agents aim to learn multiple tasks sequentially over a lifetime. This involves the ability to exploit previous knowledge when learning new tasks and to avoid forgetting. Modulating masks, a specific type of parameter isolation approach, have recently shown promise in both supervised and reinforcement learning. While lifelong learning algorithms have been investigated mainly within a single-agent approach, a question remains on how multiple agents can share lifelong learning knowledge with each other. We show that the parameter isolation mechanism used by modulating masks is particularly suitable for exchanging knowledge among agents in a distributed and decentralized system of lifelong learners. The key idea is that the isolation of specific task knowledge to specific masks allows agents to transfer only specific knowledge on-demand, resulting in robust and effective…
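The on-demand exchange described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes binary masks applied elementwise to a frozen backbone that every agent holds identically, and the names `MaskAgent`, `learn_task`, `handle_query`, and `request_mask` are hypothetical. Training is replaced by directly storing a mask, and the network is replaced by function calls between objects.

```python
class MaskAgent:
    """Hypothetical agent: a frozen shared backbone plus one binary mask per task."""

    def __init__(self, backbone):
        self.backbone = backbone  # fixed weights, assumed identical on all agents
        self.masks = {}           # task_id -> list of 0/1 entries
        self.peers = []           # other MaskAgent instances this agent can query

    def learn_task(self, task_id, mask):
        # Stand-in for training: only the mask is stored, the backbone never changes.
        self.masks[task_id] = list(mask)

    def effective_weights(self, task_id):
        # Task-specific parameters are the backbone modulated by that task's mask.
        return [w * m for w, m in zip(self.backbone, self.masks[task_id])]

    def handle_query(self, task_id):
        # Reply with a copy of the mask if this agent knows the task, else None.
        mask = self.masks.get(task_id)
        return None if mask is None else list(mask)

    def request_mask(self, task_id, connected=lambda peer: True):
        # Ask peers for one specific mask; unreachable peers are simply skipped.
        for peer in self.peers:
            if not connected(peer):
                continue
            mask = peer.handle_query(task_id)
            if mask is not None:
                self.masks[task_id] = mask
                return True
        return False


backbone = [0.5, -1.0, 2.0, 0.25]
agent_a = MaskAgent(backbone)
agent_b = MaskAgent(backbone)
agent_b.peers.append(agent_a)

agent_a.learn_task("task-A", [1, 0, 1, 0])

# A dropped connection leaves agent_b unchanged; a later retry succeeds.
dropped = agent_b.request_mask("task-A", connected=lambda peer: False)
received = agent_b.request_mask("task-A")
```

Because each task lives in its own mask, a transfer touches only that mask: the receiver's other masks and the backbone are untouched, and a failed request leaves the receiver's knowledge as it was.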
Taxonomy
Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Age of Information Optimization
Methods: Residual Connection · RMSProp · Gradient Clipping · Decentralized Distributed Proximal Policy Optimization · Experience Replay · Sigmoid Activation · Convolution · Entropy Regularization · Tanh Activation
