Mixtures of Experts Unlock Parameter Scaling for Deep RL

Johan Obando-Ceron; Ghada Sokar; Timon Willi; Clare Lyle; Jesse Farebrother; Jakob Foerster; Gintare Karolina Dziugaite; Doina Precup; Pablo Samuel Castro

arXiv:2402.08609 · cs.LG · June 27, 2024 · 3 cites

TL;DR

Integrating Mixture-of-Experts (MoE) modules, in particular Soft MoEs, into value-based reinforcement learning networks makes them scale better with parameter count, yielding substantial performance gains across a variety of training regimes and model sizes.

Contribution

The paper incorporates Mixture-of-Experts modules, in particular Soft MoEs (Puigcerver et al., 2023), into value-based RL networks and shows that the resulting models scale better with parameter count, offering empirical evidence towards scaling laws for reinforcement learning.

Findings

01 MoE modules make value-based RL networks more parameter-scalable

02 With MoE modules, performance increases with model size

03 The results provide empirical evidence towards scaling laws for RL

Abstract

The recent rapid progress in (self) supervised learning models is in large part predicted by empirical scaling laws: a model's performance scales proportionally to its size. Analogous scaling laws remain elusive for reinforcement learning domains, however, where increasing the parameter count of a model often hurts its final performance. In this paper, we demonstrate that incorporating Mixture-of-Expert (MoE) modules, and in particular Soft MoEs (Puigcerver et al., 2023), into value-based networks results in more parameter-scalable models, evidenced by substantial performance increases across a variety of training regimes and model sizes. This work thus provides strong empirical evidence towards developing scaling laws for reinforcement learning.
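
To make the Soft MoE idea mentioned in the abstract more concrete, the following is a minimal NumPy sketch of a Soft MoE layer in the style of Puigcerver et al. (2023). It is an illustration under stated assumptions, not the paper's implementation: the function name `soft_moe`, the parameter `phi`, the two-layer ReLU experts, the random weights, and all shapes are hypothetical, and how a value-based network's features are split into tokens is not described in this summary, so the tokens here are simply random vectors.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def soft_moe(tokens, phi, expert_w1, expert_w2):
    """Soft MoE forward pass (illustrative sketch).

    tokens:    (n, d)     input tokens
    phi:       (d, e, s)  learnable slot parameters (e experts, s slots each)
    expert_w1: (e, d, h)  first-layer weights, one matrix per expert
    expert_w2: (e, h, d)  second-layer weights, one matrix per expert
    """
    n, _ = tokens.shape
    _, e, s = phi.shape

    # Token-to-slot affinity scores.
    logits = np.einsum("nd,des->nes", tokens, phi)

    # Dispatch weights: for each slot, a softmax over tokens.
    dispatch = softmax(logits, axis=0)

    # Combine weights: for each token, a softmax over all e * s slots.
    combine = softmax(logits.reshape(n, e * s), axis=1).reshape(n, e, s)

    # Each slot's input is a weighted average of all tokens.
    slots = np.einsum("nes,nd->esd", dispatch, tokens)

    # Each expert processes its own slots (assumed: a small ReLU MLP).
    hidden = np.maximum(np.einsum("esd,edh->esh", slots, expert_w1), 0.0)
    slot_out = np.einsum("esh,ehd->esd", hidden, expert_w2)

    # Each output token is a weighted average of all slot outputs.
    out = np.einsum("nes,esd->nd", combine, slot_out)
    return out, dispatch, combine


# Hypothetical sizes chosen only for the demonstration.
rng = np.random.default_rng(0)
n_tokens, dim, n_experts, n_slots, hidden_dim = 6, 8, 4, 2, 16

tokens = rng.normal(size=(n_tokens, dim))
phi = rng.normal(size=(dim, n_experts, n_slots))
expert_w1 = rng.normal(size=(n_experts, dim, hidden_dim)) * 0.1
expert_w2 = rng.normal(size=(n_experts, hidden_dim, dim)) * 0.1

out, dispatch, combine = soft_moe(tokens, phi, expert_w1, expert_w2)
print(out.shape)
```

Because every token contributes to every slot through soft weights, there is no hard routing decision and the whole layer stays differentiable; adding experts increases the parameter count without changing the shape of the layer's output.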

Code & Models

Repositories

google/dopamine · tf · Official

Taxonomy

Topics: Machine Learning and ELM · Anomaly Detection Techniques and Applications · Model Reduction and Neural Networks