Sample-efficient Cross-Entropy Method for Real-time Planning
Cristina Pinneri, Shambhuraj Sawant, Sebastian Blaes, Jan Achterhold, Joerg Stueckler, Michal Rolinek, Georg Martius

TL;DR
This paper introduces an enhanced Cross-Entropy Method that significantly reduces sampling requirements and improves performance for real-time control in high-dimensional reinforcement learning tasks.
Contribution
The paper presents a novel, more sample-efficient CEM variant with temporally-correlated actions and memory, enabling real-time planning in complex environments.
Findings
Requires 2.7-22x fewer samples
Achieves 1.2-10x performance improvement
Effective in high-dimensional control tasks
Abstract
Trajectory optimizers for model-based reinforcement learning, such as the Cross-Entropy Method (CEM), can yield compelling results even in high-dimensional control tasks and sparse-reward environments. However, their sampling inefficiency prevents them from being used for real-time planning and control. We propose an improved version of the CEM algorithm for fast planning, with novel additions including temporally-correlated actions and memory, requiring 2.7-22x fewer samples and yielding a performance increase of 1.2-10x in high-dimensional control problems.
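To make the two additions named in the abstract concrete, here is a minimal sketch of a CEM planner with temporally-correlated action noise and a simple form of memory. It is an illustration under stated assumptions, not the authors' implementation: the AR(1) noise process is a stand-in for whatever correlation scheme the paper uses, "memory" is interpreted here as carrying a fraction of the elites into the next CEM iteration, and the 1-D integrator dynamics, the cost, and all names and hyperparameters (`correlated_noise`, `rollout_cost`, `cem_plan`, `rho`, `keep_frac`) are hypothetical.

```python
import numpy as np

def correlated_noise(rng, n_samples, horizon, rho):
    """AR(1) noise with unit marginal variance (assumed stand-in for
    temporally-correlated sampling); rho = 0 recovers white noise."""
    xi = rng.standard_normal((n_samples, horizon))
    eps = np.empty_like(xi)
    eps[:, 0] = xi[:, 0]
    for t in range(1, horizon):
        eps[:, t] = rho * eps[:, t - 1] + np.sqrt(1.0 - rho**2) * xi[:, t]
    return eps

def rollout_cost(actions, x0=0.0, target=1.0):
    """Hypothetical toy model: x_{t+1} = x_t + 0.1 * a_t.
    Cost is the squared distance to the target, summed over the horizon."""
    x = x0 + 0.1 * np.cumsum(actions, axis=1)
    return np.sum((x - target) ** 2, axis=1)

def cem_plan(rng, mean, horizon=20, n_samples=64, n_elites=8,
             n_iters=5, rho=0.8, keep_frac=0.5, init_std=1.0):
    std = np.full(horizon, init_std)
    kept = np.empty((0, horizon))  # memory: elites carried across iterations
    for _ in range(n_iters):
        # Sample action sequences around the current mean with correlated noise.
        samples = mean + std * correlated_noise(rng, n_samples, horizon, rho)
        samples = np.clip(samples, -1.0, 1.0)
        # Evaluate fresh samples together with the remembered elites.
        candidates = np.vstack([samples, kept])
        costs = rollout_cost(candidates)
        elites = candidates[np.argsort(costs)[:n_elites]]
        # Refit the sampling distribution to the elites.
        mean = elites.mean(axis=0)
        std = elites.std(axis=0) + 1e-6
        kept = elites[: int(keep_frac * n_elites)]
    return elites[0], mean  # best sequence found, and the final mean

rng = np.random.default_rng(0)
best, mean = cem_plan(rng, mean=np.zeros(20))
print("cost of best plan:", rollout_cost(best[None, :])[0])
```

In a receding-horizon loop, one plausible way to reuse information between control steps is to warm-start the next call with the shifted mean, for example `np.append(mean[1:], 0.0)`. Because the remembered elites are re-evaluated alongside the new samples, the best cost cannot get worse from one iteration to the next, which is one reason a planner of this kind can get by with fewer samples per iteration.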
Taxonomy
Topics: Advanced Control Systems Optimization · Robotic Path Planning Algorithms · Reinforcement Learning in Robotics
