Finetuning Offline World Models in the Real World
Yunhai Feng, Nicklas Hansen, Ziyan Xiong, Chandramouli Rajagopalan, Xiaolong Wang

TL;DR
This paper introduces a method for pretraining world models on offline data and then finetuning them online for robotic tasks, reducing the amount of online interaction required and allowing the model to adapt beyond the fixed offline dataset.
Contribution
The paper combines offline pretraining with online finetuning of world models, using regularization to handle the distribution shift between the two phases.
Findings
Enables few-shot finetuning on real robots.
Effective on visuo-motor control tasks in both simulation and the real world.
Reduces data needs for training robotic policies.
Abstract
Reinforcement Learning (RL) is notoriously data-inefficient, which makes training on a real robot difficult. While model-based RL algorithms (world models) improve data-efficiency to some extent, they still require hours or days of interaction to learn skills. Recently, offline RL has been proposed as a framework for training RL policies on pre-existing datasets without any online interaction. However, constraining an algorithm to a fixed dataset induces a state-action distribution shift between training and inference, and limits its applicability to new tasks. In this work, we seek to get the best of both worlds: we consider the problem of pretraining a world model with offline data collected on a real robot, and then finetuning the model on online data collected by planning with the learned model. To mitigate extrapolation errors during online interaction, we propose to regularize the…
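The two-phase pipeline described in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the 1-D environment `true_step`, the linear least-squares world model, the random-shooting planner, and all names and constants are hypothetical. The regularization the abstract goes on to mention is not sketched.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_step(s, a):
    # Hypothetical 1-D environment standing in for the real robot.
    return 0.9 * s + 0.5 * a

def fit_model(S, A, S_next):
    # Toy "world model": least-squares fit of s' ~ w[0]*s + w[1]*a.
    X = np.stack([np.asarray(S), np.asarray(A)], axis=1)
    w, *_ = np.linalg.lstsq(X, np.asarray(S_next), rcond=None)
    return w

def plan(w, s, goal, horizon=3, n_samples=1024):
    # Random-shooting planner: sample action sequences, roll them out
    # inside the learned model, return the first action of the cheapest one.
    actions = rng.uniform(-1.0, 1.0, size=(n_samples, horizon))
    sim = np.full(n_samples, float(s))
    cost = np.zeros(n_samples)
    for t in range(horizon):
        sim = w[0] * sim + w[1] * actions[:, t]
        cost += (sim - goal) ** 2
    return actions[np.argmin(cost), 0]

# Phase 1: pretrain on a fixed offline dataset with narrow state coverage.
S_off = rng.uniform(-0.2, 0.2, size=200)
A_off = rng.uniform(-1.0, 1.0, size=200)
S_next_off = true_step(S_off, A_off) + rng.normal(0.0, 0.01, size=200)
S, A, S_next = list(S_off), list(A_off), list(S_next_off)
w = fit_model(S, A, S_next)

# Phase 2: finetune on data collected by planning with the learned model.
goal, s = 1.0, 0.0
for _ in range(30):
    a = plan(w, s, goal)
    s_new = true_step(s, a)
    S.append(s); A.append(a); S_next.append(s_new)
    w = fit_model(S, A, S_next)  # refit on offline + online data
    s = s_new

final_error = abs(s - goal)
```

In this sketch the online phase visits states near the goal that lie outside the offline range of `[-0.2, 0.2]`, which is the kind of state-action distribution shift the abstract refers to. Because the toy model is linear it extrapolates without trouble, so the example shows the data flow only, not the extrapolation errors the paper sets out to mitigate.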
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Adversarial Robustness in Machine Learning
