Differentiable Physics Models for Real-world Offline Model-based Reinforcement Learning

Michael Lutter; Johannes Silberbauer; Joe Watson; Jan Peters

arXiv:2011.01734·cs.RO·November 4, 2020

TL;DR

This paper demonstrates that, when the mechanical structure is known, physics-based models outperform black-box models in offline model-based reinforcement learning on a real-world task, learning from only 4 minutes of data.

Contribution

It introduces a physics-based modeling approach for offline MBRL that leverages known mechanical structures and automatic differentiation, showing superior performance over black-box models in a real-world task.
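To make the idea of a physics model with learnable parameters concrete, here is a minimal sketch of identifying one physical parameter by gradient descent through a known equation of motion. It is not the authors' model or code. The frictionless pendulum, the `Dual` class (a hand-written forward-mode stand-in for an automatic differentiation library), and all function names, step sizes, and initial values are assumptions chosen for illustration.

```python
import math

class Dual:
    """Value plus derivative w.r.t. one scalar parameter (forward-mode AD)."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__

    def __sub__(self, other):
        o = Dual.lift(other)
        return Dual(self.val - o.val, self.dot - o.dot)

    def __mul__(self, other):
        o = Dual.lift(other)
        return Dual(self.val * o.val, self.dot * o.val + self.val * o.dot)
    __rmul__ = __mul__

    def __rtruediv__(self, other):
        # Derivative of c / x for a constant c is -c * x' / x**2.
        return Dual(other / self.val, -other * self.dot / self.val ** 2)

G, DT = 9.81, 0.01  # gravity [m/s^2] and time step [s]; illustrative values

def step(theta, omega, length):
    """One semi-implicit Euler step of a frictionless pendulum.

    The structure (the equation of motion) is fixed; only `length` is learned.
    """
    alpha = (G / length) * (-math.sin(theta))
    omega_next = omega + DT * alpha
    theta_next = theta + DT * omega_next
    return theta_next, omega_next

def simulate(theta0, omega0, length, n_steps):
    """Roll out the model to produce a synthetic trajectory of (theta, omega)."""
    traj = [(theta0, omega0)]
    for _ in range(n_steps):
        traj.append(step(*traj[-1], length))
    return traj

def loss(length, traj):
    """Mean squared one-step prediction error over a recorded trajectory."""
    total = 0.0
    for (th, om), (th_n, om_n) in zip(traj, traj[1:]):
        pred_th, pred_om = step(th, om, length)
        e_th, e_om = pred_th - th_n, pred_om - om_n
        total = total + e_th * e_th + e_om * e_om
    return total * (1.0 / (len(traj) - 1))

def fit_length(traj, length_init=1.0, lr=3.0, n_iters=200):
    """Gradient descent on the pendulum length using the dual-number gradient."""
    length = length_init
    for _ in range(n_iters):
        out = loss(Dual(length, 1.0), traj)  # seed d(length)/d(length) = 1
        length -= lr * out.dot
    return length

# Synthetic, noise-free data from a pendulum of length 0.5 m (assumed value).
data = simulate(theta0=1.0, omega0=0.0, length=0.5, n_steps=200)
estimate = fit_length(data)
print(f"estimated length: {estimate:.4f} m")
```

Because the equation of motion is fixed and only one parameter is free, the fit needs very little data. The trade-off is that a real system with friction or cable dynamics would be underfit by this structure unless those effects are added to the model.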

Findings

01. Physics-based models outperform black-box models in the ball in a cup (BiC) task.

02. Black-box models produce diverging trajectories and unviable policies.

03. Physics-based models learn effectively with only 4 minutes of data.

Abstract

A limitation of model-based reinforcement learning (MBRL) is the exploitation of errors in the learned models. Black-box models can fit complex dynamics with high fidelity, but their behavior is undefined outside of the data distribution. Physics-based models are better at extrapolating, due to the general validity of their informed structure, but underfit in the real world due to the presence of unmodeled phenomena. In this work, we demonstrate experimentally that for the offline model-based reinforcement learning setting, physics-based models can be beneficial compared to high-capacity function approximators if the mechanical structure is known. Using offline MBRL, physics-based models can learn to perform the ball in a cup (BiC) task on a physical manipulator from only 4 minutes of sampled data. We find that black-box models consistently produce unviable policies for BiC as all…
