Value constrained model-free continuous control
Steven Bohez, Abbas Abdolmaleki, Michael Neunert, Jonas Buchli, Nicolas Heess, Raia Hadsell

TL;DR
This paper introduces a constraint-based reinforcement learning method that ensures task success while minimizing auxiliary costs like control effort, leading to smoother and more practical control policies in continuous control tasks.
Contribution
It proposes a novel Lagrangian relaxation approach for learning control policies that satisfy constraints either in expectation or per step, with the ability to trade off return against cost dynamically.
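To make the idea of Lagrangian relaxation concrete, here is a minimal sketch on a toy one-dimensional problem. It is not the paper's algorithm. The function name `solve_value_constrained`, the toy return (equal to the control amplitude `a`), the toy cost (`a**2`), and the step sizes are all assumptions made for illustration. The sketch minimizes cost subject to a lower bound on return by alternating a gradient step on the control variable with a projected gradient step on the multiplier.

```python
def solve_value_constrained(target_return, steps=2000, lr=0.05):
    """Toy primal-dual iteration (illustrative only, not the paper's method).

    Hypothetical setup: return r(a) = a, cost c(a) = a**2, with a in [0, 1].
    Goal: minimize c(a) subject to r(a) >= target_return.
    Lagrangian: L(a, lam) = -c(a) + lam * (r(a) - target_return),
    maximized over a and minimized over lam >= 0.
    """
    a = 1.0    # start at full amplitude, i.e. a bang-bang-like control
    lam = 0.0  # Lagrange multiplier for the return constraint
    for _ in range(steps):
        grad_a = -2.0 * a + lam        # dL/da
        grad_lam = a - target_return   # dL/dlam (the constraint slack)
        # Gradient ascent on a, clipped to the valid amplitude range.
        a = min(1.0, max(0.0, a + lr * grad_a))
        # Gradient descent on lam, projected onto lam >= 0.
        lam = max(0.0, lam - lr * grad_lam)
    return a, lam


a, lam = solve_value_constrained(0.6)
print(f"amplitude={a:.3f}, multiplier={lam:.3f}")
```

In this toy setting the iteration should settle near the smallest amplitude that still meets the return target (0.6), with the multiplier near 1.2, the value at which the stationarity condition `-2a + lam = 0` holds. The multiplier adjusts itself rather than being hand-tuned, which is the sense in which the trade-off between return and cost is found automatically. The toy has a single decision variable, so it does not distinguish between constraints in expectation and per-step constraints.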
Findings
Effective in continuous control benchmarks
Successfully applied to energy-efficient quadruped locomotion
Demonstrated on real robot arm reaching task
Abstract
The naive application of Reinforcement Learning algorithms to continuous control problems -- such as locomotion and manipulation -- often results in policies which rely on high-amplitude, high-frequency control signals, known colloquially as bang-bang control. Although such solutions may indeed maximize task reward, they can be unsuitable for real world systems. Bang-bang control may lead to increased wear and tear or energy consumption, and tends to excite undesired second-order dynamics. To counteract this issue, multi-objective optimization can be used to simultaneously optimize both the reward and some auxiliary cost that discourages undesired (e.g. high-amplitude) control. In principle, such an approach can yield the sought after, smooth, control policies. It can, however, be hard to find the correct trade-off between cost and return that results in the desired behavior. In this…
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Robotic Locomotion and Control
