Temporal Difference Learning for Model Predictive Control

Nicklas Hansen; Xiaolong Wang; Hao Su

arXiv:2203.04955 · cs.LG · July 21, 2022 · 27 cites

TL;DR

This paper introduces TD-MPC, a hybrid model predictive control method that plans over a short horizon with a learned latent dynamics model and estimates long-term return with a learned terminal value function. It reports better sample efficiency and asymptotic performance than prior work on continuous control tasks from DMControl and Meta-World.

Contribution

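The paper presents TD-MPC, which combines model-based and model-free techniques: a learned task-oriented latent dynamics model handles local trajectory optimization, a learned terminal value function accounts for return beyond the planning horizon, and both are trained jointly by temporal difference learning.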

Findings

01 TD-MPC is more sample-efficient than prior methods.

02 TD-MPC achieves higher asymptotic performance than prior work.

03 TD-MPC is effective on both state-based and image-based continuous control tasks.

Abstract

Data-driven model predictive control has two key advantages over model-free methods: a potential for improved sample efficiency through model learning, and better performance as computational budget for planning increases. However, it is both costly to plan over long horizons and challenging to obtain an accurate model of the environment. In this work, we combine the strengths of model-free and model-based methods. We use a learned task-oriented latent dynamics model for local trajectory optimization over a short horizon, and use a learned terminal value function to estimate long-term return, both of which are learned jointly by temporal difference learning. Our method, TD-MPC, achieves superior sample efficiency and asymptotic performance over prior work on both state and image-based continuous control tasks from DMControl and Meta-World. Code and video results are available at…
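
The planning idea in the abstract (score short latent rollouts, then add a terminal value estimate for everything beyond the horizon) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the planner here is plain random shooting, chosen only for brevity, and the names `plan_action`, `td_target`, `toy_dynamics`, `toy_reward`, and `toy_value` are hypothetical. The toy functions stand in for networks that would be learned in practice.

```python
import numpy as np


def plan_action(z0, dynamics, reward, value, action_dim,
                horizon=5, num_samples=256, gamma=0.99, rng=None):
    """Pick a first action by scoring short latent rollouts.

    Assumption: random shooting is used as a simple stand-in planner.
    Each candidate is scored by its discounted model-predicted rewards
    over `horizon` steps plus a discounted terminal value estimate.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Candidate action sequences in [-1, 1]: (num_samples, horizon, action_dim).
    actions = rng.uniform(-1.0, 1.0, size=(num_samples, horizon, action_dim))
    z = np.repeat(z0[None, :], num_samples, axis=0)
    returns = np.zeros(num_samples)
    discount = 1.0
    for t in range(horizon):
        a = actions[:, t]
        returns += discount * reward(z, a)  # predicted reward at step t
        z = dynamics(z, a)                  # roll the latent state forward
        discount *= gamma
    returns += discount * value(z)          # terminal value covers steps >= horizon
    best = int(np.argmax(returns))
    return actions[best, 0], float(returns[best])


def td_target(r, z_next, value, gamma=0.99):
    """One-step temporal difference target: r + gamma * V(z')."""
    return r + gamma * value(z_next)


# Hypothetical toy stand-ins for learned components (latent dim == action dim).
def toy_dynamics(z, a):
    return z + 0.1 * a


def toy_reward(z, a):
    return -np.sum(z ** 2, axis=-1)


def toy_value(z):
    return -10.0 * np.sum(z ** 2, axis=-1)


first_action, score = plan_action(
    np.array([1.0, -1.0]), toy_dynamics, toy_reward, toy_value,
    action_dim=2, rng=np.random.default_rng(0),
)
```

In a receding-horizon loop, only `first_action` would be executed before replanning from the next latent state. The terminal value term is what lets the horizon stay short: without it, the score would ignore all return beyond `horizon` steps.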

Code & Models

Videos

TD-MPC Explained, With Alexander Soare (Part 1 of 2) · YouTube

TD-MPC Explained, With Alexander Soare (Part 2 of 2) · YouTube

Taxonomy

Topics: Advanced Control Systems Optimization · Metabolomics and Mass Spectrometry Studies · Machine Learning and Algorithms