Sample-efficient Cross-Entropy Method for Real-time Planning

Cristina Pinneri; Shambhuraj Sawant; Sebastian Blaes; Jan Achterhold; Joerg Stueckler; Michal Rolinek; Georg Martius

arXiv:2008.06389·cs.LG·August 17, 2020·25 cites

TL;DR

This paper introduces an improved Cross-Entropy Method (CEM) that needs 2.7-22x fewer samples and performs 1.2-10x better on high-dimensional control problems, making it fast enough for real-time planning.

Contribution

The paper presents a novel, more sample-efficient CEM variant with temporally-correlated actions and memory, enabling real-time planning in complex environments.

Findings

01 Requires 2.7-22x fewer samples
02 Achieves 1.2-10x performance improvement
03 Effective in high-dimensional control tasks

Abstract

Trajectory optimizers for model-based reinforcement learning, such as the Cross-Entropy Method (CEM), can yield compelling results even in high-dimensional control tasks and sparse-reward environments. However, their sampling inefficiency prevents them from being used for real-time planning and control. We propose an improved version of the CEM algorithm for fast planning, with novel additions including temporally-correlated actions and memory, requiring 2.7-22x fewer samples and yielding a performance increase of 1.2-10x in high-dimensional control problems.
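
To make the two additions named in the abstract concrete, here is a minimal Python sketch of a CEM planner that samples temporally correlated action sequences and carries a few elites over between iterations as a simple form of memory. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the noise model, so a first-order autoregressive (AR(1)) filter stands in for it, and the toy point-mass cost, the function names (`correlated_noise`, `rollout_cost`, `cem_plan`), and all parameter values are hypothetical.

```python
import numpy as np


def correlated_noise(rng, num_samples, horizon, rho):
    """AR(1) noise with unit marginal variance; rho=0 reduces to white noise.

    Assumption: AR(1) is a stand-in for whatever correlated noise the paper uses.
    """
    white = rng.standard_normal((num_samples, horizon))
    noise = np.empty_like(white)
    noise[:, 0] = white[:, 0]
    for t in range(1, horizon):
        noise[:, t] = rho * noise[:, t - 1] + np.sqrt(1.0 - rho**2) * white[:, t]
    return noise


def rollout_cost(actions, target=1.0, dt=0.1):
    """Hypothetical toy model: a 1D point mass driven by acceleration commands.

    Cost is the final distance to the target plus a small effort penalty.
    """
    vel = np.cumsum(actions * dt, axis=1)
    pos = np.cumsum(vel * dt, axis=1)
    return np.abs(pos[:, -1] - target) + 1e-3 * np.sum(actions**2, axis=1)


def cem_plan(horizon=20, num_samples=64, num_elites=8, iterations=5,
             rho=0.8, keep_elites=3, seed=0):
    """Plan an action sequence with CEM, correlated sampling, and elite memory."""
    rng = np.random.default_rng(seed)
    mean = np.zeros(horizon)
    std = np.ones(horizon)
    kept = np.empty((0, horizon))  # elites remembered from the previous iteration
    best_cost = np.inf
    best_actions = mean.copy()

    for _ in range(iterations):
        # Sample action sequences around the current mean with correlated noise.
        samples = mean + std * correlated_noise(rng, num_samples, horizon, rho)
        samples = np.clip(samples, -1.0, 1.0)

        # Memory: evaluate remembered elites alongside the fresh samples.
        candidates = np.vstack([samples, kept])
        costs = rollout_cost(candidates)

        # Refit the sampling distribution to the lowest-cost candidates.
        elite_idx = np.argsort(costs)[:num_elites]
        elites = candidates[elite_idx]
        mean = elites.mean(axis=0)
        std = elites.std(axis=0) + 1e-6
        kept = elites[:keep_elites]

        # Track the best sequence seen so far.
        if costs[elite_idx[0]] < best_cost:
            best_cost = costs[elite_idx[0]]
            best_actions = elites[0].copy()

    return best_actions, best_cost


if __name__ == "__main__":
    actions, cost = cem_plan()
    print("first planned action:", actions[0], "cost:", cost)
```

Setting `rho=0` and `keep_elites=0` in this sketch recovers a plain CEM loop with white noise and no memory, which gives a baseline to compare against on the same toy problem.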

Code & Models

Repositories

martius-lab/iCEM (official)

Taxonomy

Topics: Advanced Control Systems Optimization · Robotic Path Planning Algorithms · Reinforcement Learning in Robotics