IPO: Interior-point Policy Optimization under Constraints
Yongshuai Liu, Jiaxin Ding, Xin Liu

TL;DR
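This paper introduces IPO, a reinforcement learning algorithm that maximizes long-term reward while satisfying cumulative constraints by augmenting the objective with logarithmic barrier functions, in the style of interior-point methods. In the authors' evaluations, it outperforms state-of-the-art baselines on both reward maximization and constraint satisfaction.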
Contribution
The paper presents IPO, a novel first-order policy optimization algorithm for constrained RL that handles general cumulative multi-constraint settings, comes with performance guarantees, and is easy to implement.
Findings
IPO outperforms baselines in reward maximization
IPO effectively satisfies multiple cumulative constraints
The method is easy to implement and scalable
Abstract
In this paper, we study reinforcement learning (RL) algorithms to solve real-world decision problems with the objective of maximizing the long-term reward as well as satisfying cumulative constraints. We propose a novel first-order policy optimization method, Interior-point Policy Optimization (IPO), which augments the objective with logarithmic barrier functions, inspired by the interior-point method. Our proposed method is easy to implement with performance guarantees and can handle general types of cumulative multi-constraint settings. We conduct extensive evaluations to compare our approach with state-of-the-art baselines. Our algorithm outperforms the baseline algorithms in terms of reward maximization and constraint satisfaction.
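The idea of augmenting an objective with logarithmic barrier functions can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the barrier sharpness parameter `t`, and the convention that each constraint has the form "cumulative cost <= limit" are assumptions made here for illustration. Each constraint contributes a term log(limit - cost) / t, which is near zero when the policy is well inside the feasible region and falls toward negative infinity as the cost approaches its limit.

```python
import math


def log_barrier(cost, limit, t=20.0):
    """Hypothetical barrier term log(limit - cost) / t.

    Returns -inf when the constraint cost <= limit is violated or tight,
    so an infeasible policy can never look better than a feasible one.
    `t` is an assumed hyperparameter: larger t makes the barrier weaker
    away from the boundary.
    """
    slack = limit - cost
    if slack <= 0.0:
        return float("-inf")
    return math.log(slack) / t


def barrier_augmented_objective(reward_objective, costs, limits, t=20.0):
    """Reward objective plus one log-barrier term per cumulative constraint."""
    return reward_objective + sum(
        log_barrier(c, d, t) for c, d in zip(costs, limits)
    )


if __name__ == "__main__":
    # Two constraints: the first has plenty of slack, the second is near its limit.
    print(barrier_augmented_objective(5.0, costs=[0.2, 0.95], limits=[1.0, 1.0]))
```

In a policy-gradient setting the scalar `reward_objective` would be replaced by a differentiable surrogate and the costs by estimates of the cumulative constraint values, so that a first-order optimizer can ascend the augmented objective directly.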
Taxonomy
Topics: Reinforcement Learning in Robotics · Autonomous Vehicle Technology and Safety · Smart Parking Systems Research
