Towards a Practical Understanding of Lagrangian Methods in Safe Reinforcement Learning

Lindsay Spoor; Álvaro Serra-Gómez; Aske Plaat; Thomas Moerland

arXiv:2510.17564 · cs.LG · March 24, 2026

Open Access

TL;DR

This paper empirically analyzes Lagrangian methods in safe reinforcement learning, showing how sensitive the trade-off between return and cost is to the Lagrange multiplier and why cost limit selection matters. The analysis is supported by Pareto frontiers and an open-source benchmark.

Contribution

It provides a systematic empirical study of the trade-offs in safe RL, introduces Pareto frontiers for visualization, and offers guidelines for cost limit selection and an open-source code base.

Findings

01. Lagrange multiplier sensitivity varies across tasks and regimes.

02. Cost restrictiveness can differ within the same task.

03. Careful cost limit selection is crucial for evaluating safe RL methods.

Abstract

Safe reinforcement learning addresses constrained optimization problems where maximizing performance must be balanced against safety constraints, and Lagrangian methods are a widely used approach for this purpose. However, the effectiveness of Lagrangian methods depends crucially on the choice of the Lagrange multiplier $\lambda$, which governs the multi-objective trade-off between return and cost. A common practice is to update the multiplier automatically during training. Although this approach is standard in practice, there remains limited empirical evidence on the optimally achievable trade-off between return and cost as a function of $\lambda$, and there is currently no systematic benchmark comparing automated update mechanisms to this empirical optimum. Therefore, we study (i) the constraint geometry for eight widely used safety tasks and (ii) the previously overlooked…
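
To make the role of $\lambda$ concrete, here is a minimal sketch of the textbook Lagrangian relaxation and a projected gradient-ascent multiplier update. It is an illustration only: the function names, the learning rate, and the update rule itself are assumptions, since the abstract does not specify which automated update mechanisms the paper benchmarks.

```python
def lagrangian_objective(ret, cost, lam, cost_limit):
    """Scalarized objective: return minus the multiplier-weighted constraint violation.

    Hypothetical helper for illustration; not taken from the paper's code base.
    """
    return ret - lam * (cost - cost_limit)


def update_multiplier(lam, episode_cost, cost_limit, lr=0.01):
    """One projected gradient-ascent step on the multiplier (assumed update rule).

    The multiplier grows when the observed cost exceeds the limit and shrinks
    otherwise; max(0, ...) keeps it non-negative.
    """
    return max(0.0, lam + lr * (episode_cost - cost_limit))


# Cost above the limit pushes the multiplier up, penalizing cost more heavily.
lam_up = update_multiplier(0.5, episode_cost=30.0, cost_limit=25.0, lr=0.1)

# Cost well below the limit drives the multiplier down until it is clipped at zero.
lam_down = update_multiplier(0.1, episode_cost=10.0, cost_limit=25.0, lr=0.1)
```

A fixed $\lambda$ corresponds to skipping `update_multiplier` altogether and optimizing `lagrangian_objective` with a constant weight, which is how a trade-off curve as a function of $\lambda$ could be traced in principle.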

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Advanced Bandit Algorithms Research