Temporal Difference Learning for Model Predictive Control
Nicklas Hansen, Xiaolong Wang, Hao Su

TL;DR
This paper introduces TD-MPC, a model predictive control method that plans over a short horizon with a learned task-oriented latent dynamics model and estimates long-term return with a learned terminal value function. Both are trained jointly by temporal difference learning, which improves sample efficiency and asymptotic performance on continuous control tasks.
Contribution
The paper presents TD-MPC, a novel approach that integrates model-based and model-free techniques using learned latent models and value functions optimized via temporal difference learning.
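As a reminder of what "optimized via temporal difference learning" means in general, here is a minimal sketch of a generic one-step TD update. It is not the paper's loss or architecture: the function names `td_target` and `td_update`, the `done` flag, and the scalar value estimate are all illustrative assumptions.

```python
def td_target(reward, next_value, gamma=0.99, done=False):
    """One-step TD target: r + gamma * V(s'), with no bootstrap at episode end.

    All names here are hypothetical; this is the textbook TD(0) target,
    not the objective used in the paper.
    """
    bootstrap = 0.0 if done else next_value
    return reward + gamma * bootstrap


def td_update(value, target, lr=0.1):
    """Move the current value estimate a small step toward the TD target."""
    return value + lr * (target - value)


if __name__ == "__main__":
    v = 0.0
    target = td_target(reward=1.0, next_value=2.0, gamma=0.5)
    v = td_update(v, target, lr=0.1)
    print(target, v)
```

In a learned-model setting, the same kind of bootstrapped target would typically supervise a neural value function rather than a single scalar.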
Findings
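TD-MPC is more sample-efficient than prior methods on continuous control tasks from DMControl and Meta-World.
TD-MPC reaches higher asymptotic performance than prior work on the same benchmarks.
TD-MPC is effective on both state-based and image-based control tasks.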
Abstract
Data-driven model predictive control has two key advantages over model-free methods: a potential for improved sample efficiency through model learning, and better performance as computational budget for planning increases. However, it is both costly to plan over long horizons and challenging to obtain an accurate model of the environment. In this work, we combine the strengths of model-free and model-based methods. We use a learned task-oriented latent dynamics model for local trajectory optimization over a short horizon, and use a learned terminal value function to estimate long-term return, both of which are learned jointly by temporal difference learning. Our method, TD-MPC, achieves superior sample efficiency and asymptotic performance over prior work on both state and image-based continuous control tasks from DMControl and Meta-World. Code and video results are available at…
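The planning idea in the abstract (optimize a short action sequence under a latent model, then add a terminal value to account for return beyond the horizon) can be sketched as follows. This is a toy illustration under stated assumptions: `latent_dynamics`, `reward`, and `terminal_value` are hand-written stand-ins for what would be learned networks, and plain random shooting is swapped in as a simple trajectory optimizer, which is not necessarily the optimizer the paper uses.

```python
import numpy as np


def latent_dynamics(z, a):
    # Hypothetical stand-in for a learned latent dynamics model.
    return 0.9 * z + a


def reward(z, a):
    # Hypothetical stand-in for a learned reward model:
    # prefer latent states near the origin and small actions.
    return -(z ** 2).sum(axis=-1) - 0.1 * (a ** 2).sum(axis=-1)


def terminal_value(z):
    # Hypothetical stand-in for a learned terminal value function.
    return -5.0 * (z ** 2).sum(axis=-1)


def estimate_return(z0, actions, gamma=0.99):
    """Score N candidate action sequences of shape (N, H, dim) from latent z0.

    Returns the discounted sum of H predicted rewards plus the discounted
    terminal value of the final predicted latent state, shape (N,).
    """
    n, horizon, _ = actions.shape
    z = np.broadcast_to(z0, (n, z0.shape[0])).copy()
    total = np.zeros(n)
    discount = 1.0
    for t in range(horizon):
        a = actions[:, t]
        total += discount * reward(z, a)
        z = latent_dynamics(z, a)
        discount *= gamma
    return total + discount * terminal_value(z)


def plan(z0, horizon=5, num_samples=512, seed=0):
    """Random-shooting planner: sample sequences, return the best first action."""
    rng = np.random.default_rng(seed)
    actions = rng.uniform(-1.0, 1.0, size=(num_samples, horizon, z0.shape[0]))
    returns = estimate_return(z0, actions)
    best = int(np.argmax(returns))
    return actions[best, 0], returns[best]


if __name__ == "__main__":
    first_action, score = plan(np.array([1.0, -1.0]))
    print(first_action, score)
```

In a receding-horizon loop, only the first action of the best sequence would be executed before replanning from the next latent state. The terminal value term is what lets the horizon stay short without ignoring long-term return.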
Videos
TD-MPC Explained, With Alexander Soare (Part 1 of 2) (YouTube)
TD-MPC Explained, With Alexander Soare (Part 2 of 2) (YouTube)
Taxonomy
Topics: Advanced Control Systems Optimization · Metabolomics and Mass Spectrometry Studies · Machine Learning and Algorithms
