On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning
Zhao Mandi, Pieter Abbeel, Stephen James

TL;DR
This paper compares fine-tuning and meta-reinforcement learning (meta-RL) on vision-based benchmarks (Procgen, RLBench, and Atari). It finds that multi-task pretraining followed by fine-tuning often matches or outperforms meta-RL, suggesting a simpler yet effective alternative.
Contribution
The paper demonstrates that multi-task pretraining followed by fine-tuning can be as effective as meta-RL in complex, vision-based environments, calling into question whether meta-learning is necessary there.
Findings
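Multi-task pretraining followed by fine-tuning often matches or outperforms meta-RL on vision-based benchmarks when evaluated on novel tasks.
Meta-RL is often more complex and computationally expensive than pretraining followed by fine-tuning.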
Multi-task pretraining is a strong, simple baseline for future research.
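To make the pretrain-then-fine-tune recipe concrete, here is a minimal toy sketch. It is not the paper's method or code: it swaps in supervised 1-D linear regression for reinforcement learning, and every name in it (make_task, gradient_steps, the slopes, the step budget) is a hypothetical choice made for illustration. It only shows the general shape of the idea: train one model on data pooled from several related tasks, then adapt it to an unseen task with a small number of gradient steps, and compare against training from scratch with the same budget.

```python
# Toy illustration only: supervised regression stands in for RL,
# and all task definitions and hyperparameters are assumptions.

def make_task(slope, intercept=1.0):
    """A 'task' is a set of (x, y) points on the line y = slope * x + intercept."""
    xs = [i / 10.0 for i in range(-10, 11)]
    return [(x, slope * x + intercept) for x in xs]

def mse(params, data):
    """Mean squared error of the linear model (w, c) on the given data."""
    w, c = params
    return sum((w * x + c - y) ** 2 for x, y in data) / len(data)

def gradient_steps(params, data, steps, lr=0.1):
    """Run full-batch gradient descent on the MSE for a fixed number of steps."""
    w, c = params
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + c - y) * x for x, y in data) / n
        grad_c = sum(2 * (w * x + c - y) for x, y in data) / n
        w, c = w - lr * grad_w, c - lr * grad_c
    return (w, c)

# Multi-task pretraining: one model trained on data pooled across training tasks.
train_tasks = [make_task(slope) for slope in (1.5, 2.0, 2.5)]
pooled = [point for task in train_tasks for point in task]
pretrained = gradient_steps((0.0, 0.0), pooled, steps=500)

# Adaptation to a novel task (slope not seen in pretraining) under a small budget.
novel_task = make_task(2.2)
budget = 5
finetuned = gradient_steps(pretrained, novel_task, steps=budget)
scratch = gradient_steps((0.0, 0.0), novel_task, steps=budget)

finetuned_loss = mse(finetuned, novel_task)
scratch_loss = mse(scratch, novel_task)
print("fine-tuned loss:", finetuned_loss)
print("from-scratch loss:", scratch_loss)
```

In this toy setting the pretrained initialization starts close to the novel task, so the same small step budget leaves it with a lower loss than training from scratch. Whether this carries over to a given RL benchmark is an empirical question of the kind the paper investigates.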
Abstract
Intelligent agents should have the ability to leverage knowledge from previously learned tasks in order to learn new ones quickly and efficiently. Meta-learning approaches have emerged as a popular solution to achieve this. However, meta-reinforcement learning (meta-RL) algorithms have thus far been restricted to simple environments with narrow task distributions. Moreover, the paradigm of pretraining followed by fine-tuning to adapt to new tasks has emerged as a simple yet effective solution in supervised and self-supervised learning. This calls into question the benefits of meta-learning approaches also in reinforcement learning, which typically come at the cost of high complexity. We hence investigate meta-RL approaches in a variety of vision-based benchmarks, including Procgen, RLBench, and Atari, where evaluations are made on completely novel tasks. Our findings show that when…
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Memory and Neural Computing · Machine Learning and ELM
