From Pixels to Torques: Policy Learning with Deep Dynamical Models

Niklas Wahlström; Thomas B. Schön; Marc Peter Deisenroth

arXiv:1502.02251 · stat.ML · June 19, 2015

TL;DR

This paper presents a data-efficient, model-based reinforcement learning approach that learns control policies directly from pixel data using deep dynamical models, enabling autonomous systems to learn from high-dimensional visual inputs.

Contribution

The paper introduces a deep dynamical model that uses deep auto-encoders to learn a low-dimensional embedding of images jointly with a predictive model in that feature space. The learned model is then used to learn closed-loop control policies from pixel information only.

Findings

1. Learns control policies directly from pixel data efficiently.

2. Scales to high-dimensional state spaces.

3. Outperforms existing reinforcement learning methods in data efficiency.

Abstract

Data-efficient learning in continuous state-action spaces using very high-dimensional observations remains a key challenge in developing fully autonomous systems. In this paper, we consider one instance of this challenge, the pixels to torques problem, where an agent must learn a closed-loop control policy from pixel information only. We introduce a data-efficient, model-based reinforcement learning algorithm that learns such a closed-loop policy directly from pixel information. The key ingredient is a deep dynamical model that uses deep auto-encoders to learn a low-dimensional embedding of images jointly with a predictive model in this low-dimensional feature space. Joint learning ensures that not only static but also dynamic properties of the data are accounted for. This is crucial for long-term predictions, which lie at the core of the adaptive model predictive control strategy that…
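
The joint objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single-layer tanh encoder, a linear decoder, and a linear latent transition model, and every name in it (`init_params`, `encode`, `decode`, `predict_latent`, `joint_loss`, `rollout`) is hypothetical. It also assumes the prediction error is measured in pixel space by decoding the predicted feature. The point it shows is that the reconstruction and prediction errors share the same encoder and decoder, so minimizing their sum shapes the embedding for dynamics as well as for static reconstruction.

```python
import numpy as np


def init_params(rng, obs_dim, latent_dim, action_dim, scale=0.1):
    """Randomly initialise a toy model (all shapes are illustrative assumptions)."""
    return {
        "W_enc": scale * rng.standard_normal((latent_dim, obs_dim)),
        "W_dec": scale * rng.standard_normal((obs_dim, latent_dim)),
        "A": scale * rng.standard_normal((latent_dim, latent_dim)),
        "B": scale * rng.standard_normal((latent_dim, action_dim)),
    }


def encode(params, y):
    """Map flattened pixels y to a low-dimensional feature z."""
    return np.tanh(y @ params["W_enc"].T)


def decode(params, z):
    """Map a feature z back to pixel space."""
    return z @ params["W_dec"].T


def predict_latent(params, z, u):
    """One-step prediction in feature space given control input u."""
    return z @ params["A"].T + u @ params["B"].T


def joint_loss(params, y_t, u_t, y_next):
    """Sum of reconstruction error and one-step prediction error.

    Both terms depend on the same encoder and decoder weights, which is
    what couples the embedding to the dynamics.
    """
    z_t = encode(params, y_t)
    recon_loss = np.mean((decode(params, z_t) - y_t) ** 2)
    z_pred = predict_latent(params, z_t, u_t)
    pred_loss = np.mean((decode(params, z_pred) - y_next) ** 2)
    return recon_loss + pred_loss, recon_loss, pred_loss


def rollout(params, y0, actions):
    """Long-term prediction: encode once, then iterate in feature space."""
    z = encode(params, y0)
    frames = []
    for u in actions:
        z = predict_latent(params, z, u)
        frames.append(decode(params, z))
    return np.stack(frames)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = init_params(rng, obs_dim=16, latent_dim=2, action_dim=1)
    y_t = rng.standard_normal((5, 16))     # stand-in for flattened images
    u_t = rng.standard_normal((5, 1))      # stand-in for torques
    y_next = rng.standard_normal((5, 16))
    total, recon, pred = joint_loss(params, y_t, u_t, y_next)
    print("joint loss:", total)
    print("rollout shape:", rollout(params, y_t[0], u_t[:3]).shape)
```

In a real setting, the parameters would be fitted by minimizing `joint_loss` with a gradient-based optimizer, and `rollout` stands in for the kind of multi-step prediction that a model predictive controller would query when choosing control inputs.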
