Don't flatten, tokenize! Unlocking the key to SoftMoE's efficacy in deep RL

Ghada Sokar; Johan Obando-Ceron; Aaron Courville; Hugo Larochelle; Pablo Samuel Castro

arXiv:2410.01930·cs.LG·February 28, 2025

TL;DR

This paper shows that, in deep reinforcement learning, SoftMoE's performance gains come mainly from tokenizing the encoder output rather than from using multiple experts; the gains largely persist even with a single, appropriately scaled expert.

Contribution

The study identifies tokenization of the encoder output, not the use of multiple experts, as the main driver of SoftMoE's success in deep RL, a result the authors describe as surprising.

Findings

01 Tokenizing the encoder output is key to SoftMoE's effectiveness.

02 An appropriately scaled single-expert model with tokenization largely retains SoftMoE's performance gains.

03 Performance gains are largely due to tokenization rather than to the use of multiple experts.

Abstract

The use of deep neural networks in reinforcement learning (RL) often suffers from performance degradation as model size increases. While soft mixtures of experts (SoftMoEs) have recently shown promise in mitigating this issue for online RL, the reasons behind their effectiveness remain largely unknown. In this work we provide an in-depth analysis identifying the key factors driving this performance gain. We discover the surprising result that tokenizing the encoder output, rather than the use of multiple experts, is what is behind the efficacy of SoftMoEs. Indeed, we demonstrate that even with an appropriately scaled single expert, we are able to maintain the performance gains, largely thanks to tokenization.
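To make the distinction in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It contrasts flattening an encoder feature map into one vector with splitting it into tokens, then passes the tokens through a soft mixture layer that has a single expert. The function names, the per-position tokenization scheme, the ReLU expert, and all shapes are assumptions chosen for illustration; the dispatch and combine weights follow the general Soft MoE formulation (softmax over tokens per slot, and softmax over slots per token).

```python
import numpy as np


def flatten(features):
    """Collapse an (H, W, C) feature map into one vector of length H*W*C."""
    return features.reshape(-1)


def tokenize_per_position(features):
    """Treat each spatial position as a token: (H, W, C) -> (H*W, C).

    Illustrative choice: one token per spatial location, C features each.
    """
    h, w, c = features.shape
    return features.reshape(h * w, c)


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def soft_moe_single_expert(tokens, phi, w_expert):
    """Soft mixture layer with a single expert (illustrative sketch).

    tokens:   (n, d) input tokens
    phi:      (d, s) learnable slot parameters, s slots for the one expert
    w_expert: (d, d_out) weights of a hypothetical one-layer ReLU expert
    """
    logits = tokens @ phi                      # (n, s) token-slot affinities
    dispatch = softmax(logits, axis=0)         # each slot: weights over tokens
    combine = softmax(logits, axis=1)          # each token: weights over slots
    slots = dispatch.T @ tokens                # (s, d) weighted token averages
    slot_out = np.maximum(slots @ w_expert, 0.0)  # (s, d_out) expert output
    return combine @ slot_out                  # (n, d_out) per-token output


# Hypothetical encoder output: 3x3 spatial grid with 4 channels.
features = np.arange(36, dtype=float).reshape(3, 3, 4)

flat = flatten(features)                  # one input of length 36
tokens = tokenize_per_position(features)  # nine tokens of length 4

rng = np.random.default_rng(0)
phi = rng.standard_normal((4, 2))         # 2 slots
w_expert = rng.standard_normal((4, 5))    # expert maps 4 -> 5 features
out = soft_moe_single_expert(tokens, phi, w_expert)
```

In the flattened case, a dense layer sees a single 36-dimensional input. In the tokenized case, the same expert weights are shared across slots built from nine 4-dimensional tokens, which is the structural difference the abstract credits for the gains.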

Videos

Don't flatten, tokenize! Unlocking the key to SoftMoE's efficacy in deep RL (SlidesLive)
