Compute-Optimal Scaling for Value-Based Deep RL
Preston Fu, Oleh Rybkin, Zhiyuan Zhou, Michal Nauman, Pieter Abbeel, Sergey Levine, Aviral Kumar

TL;DR
This paper explores how to optimally allocate compute resources in value-based deep reinforcement learning by analyzing the interplay between model size, batch size, and update-to-data ratio, introducing the concept of TD-overfitting.
Contribution
It provides a theoretical framework and practical guidelines for compute-optimal scaling in deep RL, highlighting the phenomenon of TD-overfitting and its implications.
Findings
Large models are less affected by batch size increases, enabling more efficient scaling.
TD-overfitting occurs in small models, reducing Q-function accuracy with larger batches.
The paper offers guidelines for balancing model capacity against the update-to-data (UTD) ratio to maximize compute efficiency.
Abstract
As models grow larger and training them becomes expensive, it becomes increasingly important to scale training recipes not just to larger models and more data, but to do so in a compute-optimal manner that extracts maximal performance per unit of compute. While such scaling has been well studied for language modeling, reinforcement learning (RL) has received less attention in this regard. In this paper, we investigate compute scaling for online, value-based deep RL. These methods present two primary axes for compute allocation: model capacity and the update-to-data (UTD) ratio. Given a fixed compute budget, we ask: how should resources be partitioned across these axes to maximize sample efficiency? Our analysis reveals a nuanced interplay between model size, batch size, and UTD. In particular, we identify a phenomenon we call TD-overfitting: increasing the batch quickly harms Q-function…
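To make the trade-off between model capacity and UTD concrete, here is a minimal sketch of how a fixed compute budget might be accounted for. It is not the paper's formula: the function names, the example numbers, and the rule of thumb of about 6 FLOPs per parameter per sample for a forward and backward pass are all assumptions made for illustration.

```python
def training_flops(n_params, batch_size, utd, env_steps,
                   flops_per_param_per_sample=6):
    """Rough training cost of a value-based agent (illustrative assumption).

    The UTD ratio is the number of gradient updates per environment step,
    so the total number of updates is utd * env_steps.
    """
    grad_updates = utd * env_steps
    return flops_per_param_per_sample * n_params * batch_size * grad_updates


def utd_for_budget(budget_flops, n_params, batch_size, env_steps,
                   flops_per_param_per_sample=6):
    """UTD ratio that exactly spends budget_flops for a given model size."""
    per_update = flops_per_param_per_sample * n_params * batch_size
    return budget_flops / (per_update * env_steps)


# Two hypothetical configurations with the same estimated cost:
# a small model updated often, and a 4x larger model updated 4x less often.
small = training_flops(n_params=1_000_000, batch_size=256, utd=8,
                       env_steps=100_000)
large = training_flops(n_params=4_000_000, batch_size=256, utd=2,
                       env_steps=100_000)
print(small == large)
print(utd_for_budget(small, n_params=4_000_000, batch_size=256,
                     env_steps=100_000))
```

Under this accounting, quadrupling the model size at a fixed budget and batch size leaves a quarter of the UTD. Which of the two configurations is more sample-efficient is the empirical question the paper studies; the sketch only shows that they cost the same.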
Taxonomy
Topics: Machine Learning and Data Classification
