Bigger, Regularized, Optimistic: scaling for compute and sample-efficient continuous control
Michal Nauman, Mateusz Ostaszewski, Krzysztof Jankowski, Piotr Miłoś, Marek Cygan

TL;DR
This paper shows that scaling model capacity combined with regularization and optimistic exploration can significantly improve sample efficiency and performance in continuous control reinforcement learning tasks.
Contribution
It introduces the BRO algorithm, which leverages strong regularization and scaling to achieve state-of-the-art results in complex RL benchmarks.
Findings
BRO outperforms leading model-based and model-free algorithms across 40 tasks from the DeepMind Control, MetaWorld, and MyoSuite benchmarks
Achieves near-optimal policies in challenging tasks
Scaling with regularization enhances RL performance
Abstract
Sample efficiency in Reinforcement Learning (RL) has traditionally been driven by algorithmic enhancements. In this work, we demonstrate that scaling can also lead to substantial improvements. We conduct a thorough investigation into the interplay of scaling model capacity and domain-specific RL enhancements. These empirical findings inform the design choices underlying our proposed BRO (Bigger, Regularized, Optimistic) algorithm. The key innovation behind BRO is that strong regularization allows for effective scaling of the critic networks, which, paired with optimistic exploration, leads to superior performance. BRO achieves state-of-the-art results, significantly outperforming the leading model-based and model-free algorithms across 40 complex tasks from the DeepMind Control, MetaWorld, and MyoSuite benchmarks. BRO is the first model-free algorithm to achieve near-optimal policies in…
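The abstract names two ingredients without giving their exact form: a regularized critic that can be scaled up, and optimistic exploration. The following is a minimal sketch of what such pieces could look like, not the authors' implementation. It assumes layer normalization inside a residual block as the regularizer, and an optimism bonus equal to the standard deviation across an ensemble of critic estimates. The names `residual_block`, `optimistic_value`, and the parameter `beta` are hypothetical and chosen only for illustration.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and (approximately) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def dense(x, weights, bias):
    """Affine layer; `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def residual_block(x, w1, b1, w2, b2):
    """Assumed regularized block: dense -> layer norm -> ReLU -> dense
    -> layer norm, added back onto the input (skip connection)."""
    h = relu(layer_norm(dense(x, w1, b1)))
    h = layer_norm(dense(h, w2, b2))
    return [a + b for a, b in zip(x, h)]


def optimistic_value(q_estimates, beta=1.0):
    """Assumed optimism rule: ensemble mean plus beta times the
    (population) standard deviation of the critic estimates."""
    mean = sum(q_estimates) / len(q_estimates)
    var = sum((q - mean) ** 2 for q in q_estimates) / len(q_estimates)
    return mean + beta * math.sqrt(var)


if __name__ == "__main__":
    x = [0.5, -1.0, 2.0]
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    zeros = [0.0, 0.0, 0.0]
    print(residual_block(x, identity, zeros, identity, zeros))
    print(optimistic_value([1.0, 3.0], beta=1.0))
```

In this sketch, a negative `beta` would turn the same rule into a pessimistic estimate, which is one common way to separate the value used for exploration from the value used for learning targets; whether BRO does this is not stated in the abstract.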
Taxonomy
Topics: Advanced Control Systems Optimization · Control Systems and Identification · Neural Networks and Applications
