RLBenchNet: The Right Network for the Right Reinforcement Learning Task
Ivan Smirnov, Shangding Gu

TL;DR
This paper systematically evaluates neural network architectures on reinforcement learning tasks, identifies their strengths and limitations across different environments, and offers guidance for choosing an architecture based on task complexity and resource constraints.
Contribution
The paper offers a comprehensive comparison of neural network architectures in RL, highlights the trade-offs between performance and efficiency, and identifies Mamba as a high-throughput alternative with competitive results.
Findings
MLPs excel in fully observable continuous control tasks.
Recurrent architectures like LSTM and GRU perform well in partially observable environments.
Mamba achieves 4.5x higher throughput than LSTM and 3.9x higher throughput than GRU, with comparable performance.
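
The three findings above can be read as a rough selection rule. The sketch below is only an illustration of that reading, not code from the paper: the `TaskProfile` fields and the `suggest_architecture` function are hypothetical names, and cases the listed findings do not cover return `None` rather than a guess.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskProfile:
    """Hypothetical description of an RL task (not defined in the paper)."""
    fully_observable: bool
    continuous_control: bool
    throughput_critical: bool = False


def suggest_architecture(task: TaskProfile) -> Optional[str]:
    """Map a task profile to an architecture family using the listed findings."""
    # Finding 1: MLPs excel in fully observable continuous control tasks.
    if task.fully_observable and task.continuous_control:
        return "MLP"
    if not task.fully_observable:
        # Finding 3: Mamba has higher throughput than LSTM and GRU,
        # with comparable performance.
        if task.throughput_critical:
            return "Mamba"
        # Finding 2: LSTM and GRU perform well under partial observability.
        return "LSTM or GRU"
    # Fully observable, non-continuous tasks are not covered by these findings.
    return None


if __name__ == "__main__":
    print(suggest_architecture(TaskProfile(True, True)))
    print(suggest_architecture(TaskProfile(False, False, throughput_critical=True)))
```

The rule is deliberately coarse: it encodes only what the findings state and leaves every other combination undecided.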
Abstract
Reinforcement learning (RL) has seen significant advancements through the application of various neural network architectures. In this study, we systematically investigate the performance of several neural networks in RL tasks, including Long Short-Term Memory (LSTM), Multi-Layer Perceptron (MLP), Mamba/Mamba-2, Transformer-XL, Gated Transformer-XL, and Gated Recurrent Unit (GRU). Through comprehensive evaluation across continuous control, discrete decision-making, and memory-based environments, we identify architecture-specific strengths and limitations. Our results reveal that: (1) MLPs excel in fully observable continuous control tasks, providing an optimal balance of performance and efficiency; (2) recurrent architectures like LSTM and GRU offer robust performance in partially observable environments with moderate memory requirements; (3) Mamba models achieve a 4.5x higher…
Taxonomy
Topics: Reinforcement Learning in Robotics
Methods: Attention Is All You Need · Softmax · Cosine Annealing · Variational Dropout · Linear Layer · Residual Connection · Dropout · Adam · Adaptive Input Representations
