Trajectory Entropy Reinforcement Learning for Predictable and Robust Control
Bang You, Chenxu Wang, Huaping Liu

TL;DR
This paper introduces Trajectory Entropy Reinforcement Learning, a method that promotes simple, predictable policies by minimizing the entropy of action trajectories, leading to more robust control in complex tasks.
Contribution
It proposes a novel entropy-based inductive bias for reinforcement learning, optimizing policies for simplicity and robustness through trajectory entropy minimization.
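A minimal sketch of the idea, not the authors' implementation: it reads "trajectory entropy" as the conditional entropy of action trajectories given state trajectories and subtracts it, weighted by a coefficient, from the average return. The discrete toy trajectories, the plug-in estimator, the function names, and the weight `alpha` are all assumptions made for illustration; this summary does not describe the estimator the paper actually uses.

```python
import math
from collections import Counter


def conditional_entropy_bits(pairs):
    """Plug-in estimate of H(actions | states) in bits.

    `pairs` is a list of (state_trajectory, action_trajectory) tuples,
    where each trajectory is itself a hashable tuple (illustrative format).
    """
    joint = Counter(pairs)                       # counts of (states, actions)
    marginal = Counter(s for s, _ in pairs)      # counts of states alone
    n = len(pairs)
    h = 0.0
    for (s, _a), c in joint.items():
        # p(s, a) * log2 p(a | s), accumulated with a minus sign
        h -= (c / n) * math.log2(c / marginal[s])
    return h


def regularized_objective(mean_return, pairs, alpha=0.1):
    """Average return minus an entropy penalty; `alpha` is a hypothetical weight."""
    return mean_return - alpha * conditional_entropy_bits(pairs)


# Toy data: the same state trajectories, with predictable vs. erratic actions.
predictable = [
    (("s0", "s1"), ("L", "L")),
    (("s0", "s1"), ("L", "L")),
    (("s0", "s2"), ("R", "R")),
    (("s0", "s2"), ("R", "R")),
]
erratic = [
    (("s0", "s1"), ("L", "L")),
    (("s0", "s1"), ("L", "R")),
    (("s0", "s2"), ("R", "R")),
    (("s0", "s2"), ("R", "L")),
]

h_predictable = conditional_entropy_bits(predictable)
h_erratic = conditional_entropy_bits(erratic)
```

With equal returns, the policy whose actions are fully determined by the observed states scores higher under this objective than the erratic one, which is the sense in which the bias favors simple, predictable behavior.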
Findings
Learned policies are more cyclical and consistent.
The agent achieves superior performance in high-dimensional tasks.
The agent demonstrates increased robustness to noise and environment changes.
Abstract
Simplicity is a critical inductive bias for designing data-driven controllers, especially when robustness is important. Despite the impressive results of deep reinforcement learning in complex control tasks, it is prone to capturing intricate and spurious correlations between observations and actions, leading to failure under slight perturbations to the environment. To tackle this problem, in this work we introduce a novel inductive bias towards simple policies in reinforcement learning. The simplicity inductive bias is introduced by minimizing the entropy of entire action trajectories, corresponding to the number of bits required to describe information in action trajectories after the agent observes state trajectories. Our reinforcement learning agent, Trajectory Entropy Reinforcement Learning, is optimized to minimize the trajectory entropy while maximizing rewards. We show that the…
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Robotic Locomotion and Control
