Lipschitz-Regularized Critics Lead to Policy Robustness Against Transition Dynamics Uncertainty
Xulin Chen, Ruipeng Liu, Zhenyu Gan, Garrett E. Katz

TL;DR
This paper introduces PPO-PGDLC, a reinforcement learning algorithm that combines Lipschitz-regularized critics with adversarial state sampling to enhance policy robustness against transition uncertainties, validated on control and robotic tasks.
Contribution
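It proposes a novel RL method that integrates Lipschitz regularization with adversarial state sampling. The method addresses two gaps in prior work: the unexamined effect of critic-only Lipschitz regularization on policy robustness, and the limited real-world validation of approaches that learn robust Bellman operators.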
Findings
PPO-PGDLC outperforms baseline algorithms in control and robotic tasks.
The method produces smoother actions under environmental perturbations.
Experimental results demonstrate improved robustness and performance.
Abstract
Uncertainties in transition dynamics pose a critical challenge in reinforcement learning (RL), often resulting in performance degradation of trained policies when deployed on hardware. Many robust RL approaches follow two strategies: enforcing smoothness in actor or actor-critic modules with Lipschitz regularization, or learning robust Bellman operators. However, the first strategy does not investigate the impact of critic-only Lipschitz regularization on policy robustness, while the second lacks comprehensive validation in real-world scenarios. Building on this gap and prior work, we propose PPO-PGDLC, an algorithm based on Proximal Policy Optimization (PPO) that integrates Projected Gradient Descent (PGD) with a Lipschitz-regularized critic (LC). The PGD component calculates the adversarial state within an uncertainty set to approximate the robust Bellman operator, and the…
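The PGD step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the uncertainty set is an L-infinity ball of radius eps around the nominal state, that the adversarial state is the one that minimizes the critic value inside that ball, and that the critic is a tiny one-hidden-layer tanh network with an analytic gradient. The names critic_value, critic_grad, and pgd_adversarial_state, and the default step size, are hypothetical.

```python
import numpy as np


def critic_value(s, W1, b1, w2):
    """Toy critic V(s) = w2 . tanh(W1 s + b1). Assumed form, for illustration only."""
    return float(w2 @ np.tanh(W1 @ s + b1))


def critic_grad(s, W1, b1, w2):
    """Analytic gradient of the toy critic with respect to the state s."""
    h = np.tanh(W1 @ s + b1)
    return W1.T @ (w2 * (1.0 - h ** 2))


def pgd_adversarial_state(s, eps, W1, b1, w2, steps=10, step_size=None):
    """Search the L-infinity ball of radius eps around s for a low-value state.

    Each iteration takes a signed gradient descent step on V, then projects
    back onto the ball by clipping. The lowest-value iterate seen so far
    (including the nominal state) is returned.
    """
    if step_size is None:
        step_size = 2.5 * eps / steps  # assumed heuristic, not from the paper
    best_s, best_v = s.copy(), critic_value(s, W1, b1, w2)
    s_adv = s.copy()
    for _ in range(steps):
        g = critic_grad(s_adv, W1, b1, w2)
        s_adv = s_adv - step_size * np.sign(g)    # descend on the critic value
        s_adv = np.clip(s_adv, s - eps, s + eps)  # project onto the uncertainty set
        v = critic_value(s_adv, W1, b1, w2)
        if v < best_v:
            best_s, best_v = s_adv.copy(), v
    return best_s, best_v


# Usage with random toy parameters (purely illustrative).
rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 3))
b1 = rng.normal(size=8)
w2 = rng.normal(size=8)
s = rng.normal(size=3)
eps = 0.1
s_adv, v_adv = pgd_adversarial_state(s, eps, W1, b1, w2)
```

In a robust Bellman backup, the value at such an adversarial state would stand in for the worst-case next-state value. How PPO-PGDLC uses it exactly, and how the Lipschitz-regularized critic enters, is not specified in the truncated abstract above.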
Taxonomy
Topics: Simulation Techniques and Applications
