On Robust Reinforcement Learning with Lipschitz-Bounded Policy Networks

Nicholas H. Barbara; Ruigang Wang; Ian R. Manchester

arXiv:2405.11432 · cs.LG · February 7, 2025 · 2 cites

TL;DR

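This paper studies how Lipschitz-bounded policy networks in deep reinforcement learning improve robustness to disturbances, random noise, and adversarial attacks. It shows that the structure of the Lipschitz layer matters: spectral normalization severely impacts clean performance, whereas more expressive layers such as the Sandwich layer improve robustness without that penalty.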

Contribution

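The paper shows empirically, on pendulum swing-up and Atari Pong, that Lipschitz-bounded policy parameterizations improve robustness in reinforcement learning, and it compares Lipschitz layer structures (spectral normalization and the Sandwich layer) by their effect on both robustness and clean performance.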

Findings

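01. Lipschitz-bounded policies are more robust to disturbances, random noise, and targeted adversarial attacks than unconstrained MLP or CNN policies.

02. Sandwich layers outperform spectral normalization, which is too conservative and severely impacts clean performance.

03. Smaller Lipschitz bounds lead to increased robustness.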
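
For reference, the standard definition behind these findings can be written out as follows. This is a sketch of the usual textbook statement, not notation taken from the paper: the symbols \pi (policy), \gamma (Lipschitz bound), x (observation), and \delta (perturbation) are assumed here, as is the choice of the Euclidean norm.

```latex
% A policy \pi is \gamma-Lipschitz (assumed: Euclidean norm) if
\[
  \|\pi(x_1) - \pi(x_2)\|_2 \;\le\; \gamma \,\|x_1 - x_2\|_2
  \qquad \text{for all } x_1, x_2 .
\]
% Setting x_1 = x + \delta and x_2 = x bounds the change in action
% caused by any observation perturbation \delta:
\[
  \|\pi(x + \delta) - \pi(x)\|_2 \;\le\; \gamma \,\|\delta\|_2 .
\]
```

A smaller \gamma therefore caps how far noise or an attack of a given size can move the action, which is consistent with finding 03.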

Abstract

This paper presents a study of robust policy networks in deep reinforcement learning. We investigate the benefits of policy parameterizations that naturally satisfy constraints on their Lipschitz bound, analyzing their empirical performance and robustness on two representative problems: pendulum swing-up and Atari Pong. We illustrate that policy networks with smaller Lipschitz bounds are more robust to disturbances, random noise, and targeted adversarial attacks than unconstrained policies composed of vanilla multi-layer perceptrons or convolutional neural networks. However, the structure of the Lipschitz layer is important. We find that the widely-used method of spectral normalization is too conservative and severely impacts clean performance, whereas more expressive Lipschitz layers such as the recently-proposed Sandwich layer can achieve improved robustness without sacrificing clean…
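
The spectral normalization baseline mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the helper names `spectral_normalize` and `mlp`, the layer sizes, the bias-free tanh network, and the even split of the bound across layers are all assumptions made for illustration. The spectral norm is computed exactly with `np.linalg.norm(W, ord=2)`, whereas practical implementations usually estimate it by power iteration. The Sandwich layer is not reproduced here.

```python
import numpy as np

def spectral_normalize(weights, gamma=1.0):
    """Rescale each weight matrix so the product of spectral norms equals gamma.

    Assumption: the bound gamma is split evenly across layers.
    """
    per_layer = gamma ** (1.0 / len(weights))
    return [per_layer * W / np.linalg.norm(W, ord=2) for W in weights]

def mlp(x, weights):
    """Bias-free MLP with tanh hidden activations (tanh is 1-Lipschitz)."""
    h = x
    for W in weights[:-1]:
        h = np.tanh(W @ h)
    return weights[-1] @ h

rng = np.random.default_rng(0)
sizes = [3, 32, 32, 1]  # hypothetical: 3 observations in, 1 action out
raw = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]

gamma = 2.0  # hypothetical Lipschitz bound
bounded = spectral_normalize(raw, gamma)

# Empirical check: the output change per unit of input perturbation
# can never exceed gamma for the normalized network.
x = rng.normal(size=3)
delta = 0.1 * rng.normal(size=3)
gain = np.linalg.norm(mlp(x + delta, bounded) - mlp(x, bounded)) / np.linalg.norm(delta)
print(f"observed gain {gain:.3f} <= bound {gamma}")
```

Because the guarantee is the product of per-layer spectral norms, the observed gain is typically well below `gamma`. That looseness is one way to read the abstract's remark that spectral normalization is too conservative.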

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control

Methods: Spectral Normalization