Pretraining & Reinforcement Learning: Sharpening the Axe Before Cutting the Tree
Saurav Kadavath, Samuel Paradis, Brian Yao

TL;DR
This paper evaluates the impact of pretraining on deep reinforcement learning performance, showing that relevant datasets improve training efficiency and investigating how to divide a limited budget of environment steps between pretraining and RL.
Contribution
It demonstrates that pretraining on relevant datasets enhances RL training efficiency and introduces methods to optimize step allocation between pretraining and RL.
Findings
Filters learned on less relevant datasets render pretraining ineffective.
Pretraining on in-distribution datasets reliably reduces RL training time and improves performance after 80k RL training steps.
Optimal step division improves RL performance with limited environment steps.
Abstract
Pretraining is a common technique in deep learning for increasing performance and reducing training time, with promising experimental results in deep reinforcement learning (RL). However, pretraining requires a relevant dataset for training. In this work, we evaluate the effectiveness of pretraining for RL tasks, with and without distracting backgrounds, using both large, publicly available datasets with minimal relevance, as well as case-by-case generated datasets labeled via self-supervision. Results suggest filters learned during training on less relevant datasets render pretraining ineffective, while filters learned during training on the in-distribution datasets reliably reduce RL training time and improve performance after 80k RL training steps. We further investigate, given a limited number of environment steps, how to optimally divide the available steps into pretraining and RL…
Taxonomy
Topics: Complex Systems and Decision Making · Cognitive Science and Mapping · Behavioral and Psychological Studies
