A Dual Perspective of Reinforcement Learning for Imposing Policy Constraints
Bram De Cooman, Johan Suykens

TL;DR
This paper introduces a unified primal-dual framework for reinforcement learning that enables imposing diverse behavioral constraints on policies, including novel types, by linking dual variables to reward modifications.
Contribution
It unifies existing constraint techniques within a primal-dual approach and introduces new constraints, providing a versatile method for policy regularization in reinforcement learning.
Findings
The DualCRL method effectively enforces various constraints during training.
The framework reveals a relationship between dual constraints and reward modifications.
Experimental results demonstrate the method's versatility and efficacy.
Abstract
Model-free reinforcement learning methods lack an inherent mechanism to impose behavioural constraints on the trained policies. Although certain extensions exist, they remain limited to specific types of constraints, such as value constraints with additional reward signals or visitation density constraints. In this work, we unify these existing techniques and bridge the gap with classical optimization and control theory, using a generic primal-dual framework for value-based and actor-critic reinforcement learning methods. The obtained dual formulations turn out to be especially useful for imposing additional constraints on the learned policy, as an intrinsic relationship between such dual constraints (or regularization terms) and reward modifications in the primal is revealed. Furthermore, using this framework, we are able to introduce some novel types of constraints, allowing us to impose…
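The link between dual constraints and reward modifications mentioned in the abstract can be illustrated with the textbook linear-programming view of a discounted MDP. The sketch below is not taken from the paper: the symbols $\mu_0$ (initial state distribution), $d$ (normalized state-action occupancy), $c$ (a hypothetical cost signal), $\beta$ (its budget), and $\lambda$ (the associated multiplier) are assumptions introduced for illustration, and the paper's own notation and constraint types may differ.

```latex
% Primal LP over state values V (standard formulation, assumed here).
\begin{align*}
\min_{V} \quad & (1-\gamma) \sum_{s} \mu_0(s)\, V(s) \\
\text{s.t.} \quad & V(s) \ge r(s,a) + \gamma \sum_{s'} P(s' \mid s,a)\, V(s')
  \quad \forall s,a
\end{align*}

% Dual LP over occupancy measures d (flow-conservation constraints).
\begin{align*}
\max_{d \ge 0} \quad & \sum_{s,a} d(s,a)\, r(s,a) \\
\text{s.t.} \quad & \sum_{a} d(s',a) = (1-\gamma)\, \mu_0(s')
  + \gamma \sum_{s,a} P(s' \mid s,a)\, d(s,a) \quad \forall s'
\end{align*}

% Add one extra (hypothetical) constraint to the dual:
%   \sum_{s,a} d(s,a) c(s,a) \le \beta.
% Its multiplier \lambda \ge 0 becomes a new primal variable, and the
% primal constraints see the modified reward r - \lambda c.
\begin{align*}
\min_{V,\ \lambda \ge 0} \quad & (1-\gamma) \sum_{s} \mu_0(s)\, V(s) + \lambda \beta \\
\text{s.t.} \quad & V(s) \ge r(s,a) - \lambda\, c(s,a)
  + \gamma \sum_{s'} P(s' \mid s,a)\, V(s') \quad \forall s,a
\end{align*}
```

For a fixed $\lambda$, the last program is an ordinary MDP with reward $r - \lambda c$, which is one concrete sense in which a constraint on the dual side shows up as a reward modification on the primal side. If the added constraint is inactive at the optimum, complementary slackness gives $\lambda = 0$ and the original reward is recovered.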
Taxonomy
Topics: Supply Chain and Inventory Management · Auction Theory and Applications
