Data-Efficient Learning of Feedback Policies from Image Pixels using Deep Dynamical Models

John-Alexander M. Assael; Niklas Wahlström; Thomas B. Schön; Marc Peter Deisenroth

arXiv:1510.02173·cs.AI·October 12, 2015·24 cites

TL;DR

This paper presents a data-efficient, model-based reinforcement learning approach that learns closed-loop control policies directly from high-dimensional pixel observations using deep dynamical models, enabling scalable and autonomous control.

Contribution

It introduces a method that jointly learns a low-dimensional feature embedding of images and a predictive model in that feature space, a step toward end-to-end reinforcement learning from pixels to control torques.

Findings

01 Learns control policies quickly from pixel data.

02 Scales effectively to high-dimensional image observations.

03 Outperforms existing RL methods in data efficiency and scalability.

Abstract

Data-efficient reinforcement learning (RL) in continuous state-action spaces using very high-dimensional observations remains a key challenge in developing fully autonomous systems. We consider a particularly important instance of this challenge, the pixels-to-torques problem, where an RL agent learns a closed-loop control policy ("torques") from pixel information only. We introduce a data-efficient, model-based reinforcement learning algorithm that learns such a closed-loop policy directly from pixel information. The key ingredient is a deep dynamical model for learning a low-dimensional feature embedding of images jointly with a predictive model in this low-dimensional feature space. Joint learning is crucial for long-term predictions, which lie at the core of the adaptive nonlinear model predictive control strategy that we use for closed-loop control. Compared to state-of-the-art RL…
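
The joint-learning idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the linear encoder, decoder, and latent transition below are hypothetical stand-ins for deep networks, and the loss (reconstruction error plus one-step prediction error, both measured in pixel space) is an assumed form chosen only to show why the embedding and the predictive model share one objective. All names (`joint_loss`, `enc`, `dec`, `trans`) are illustrative.

```python
# Toy sketch of a joint objective for an embedding plus a latent predictive model.
# ASSUMPTION: linear maps replace the deep networks; the loss form is illustrative.

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def sq_err(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def joint_loss(enc, dec, trans, frames, controls):
    """Return (total, reconstruction, prediction) error over one trajectory.

    enc:   d x D encoder (image -> feature)
    dec:   D x d decoder (feature -> image)
    trans: d x (d + m) transition acting on [feature, control]
    """
    recon, pred = 0.0, 0.0
    for t in range(len(frames) - 1):
        z_t = matvec(enc, frames[t])                  # embed current image
        recon += sq_err(matvec(dec, z_t), frames[t])  # reconstruct it
        z_next = matvec(trans, z_t + controls[t])     # predict next feature
        pred += sq_err(matvec(dec, z_next), frames[t + 1])  # decode and compare
    return recon + pred, recon, pred

# Hypothetical 4-pixel "images" that live in a 2-D subspace, with a 1-D control.
enc = [[1, 0, 0, 0], [0, 1, 0, 0]]
dec = [[1, 0], [0, 1], [0, 0], [0, 0]]
trans = [[1, 0, 1], [0, 1, 0]]          # first feature is shifted by the control
controls = [[0.1], [0.2], [-0.3]]

# Generate a trajectory from the same model, so the model fits it exactly.
z = [0.5, -1.0]
frames = [matvec(dec, z)]
for u in controls:
    z = matvec(trans, z + u)
    frames.append(matvec(dec, z))

total, recon, pred = joint_loss(enc, dec, trans, frames, controls)

# A transition that ignores the control reconstructs well but predicts poorly.
bad_trans = [[1, 0, 0], [0, 1, 0]]
bad_total, bad_recon, bad_pred = joint_loss(enc, dec, bad_trans, frames, controls)
```

In this toy setting the second model has zero reconstruction error and nonzero prediction error, which is the case a reconstruction-only objective would not penalize. Minimizing both terms together is one way to make the learned features useful for the long-term predictions that model predictive control relies on.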

Taxonomy

Topics: Reinforcement Learning in Robotics · Model Reduction and Neural Networks · Adaptive Dynamic Programming Control