Worrisome Properties of Neural Network Controllers and Their Symbolic Representations
Jacek Cyranka, Kevin E M Church, Jean-Philippe Lessard

TL;DR
This paper investigates the robustness issues of neural network controllers in reinforcement learning, revealing persistent low-return solutions and vulnerabilities, and introduces methods to analyze and prove their existence.
Contribution
It highlights the prevalence of persistent low-return solutions in neural controllers and provides algorithms and proofs for their systematic robustness analysis.
Findings
Neural controllers often have many persistent low-return solutions.
Simpler controllers tend to have more persistent bad solutions.
The paper introduces a computer-assisted proof methodology for robustness analysis.
Abstract
We raise concerns about controllers' robustness in simple reinforcement learning benchmark problems. We focus on neural network controllers and their low-neuron and symbolic abstractions. A typical controller reaching high mean return values still generates an abundance of persistent low-return solutions, which is a highly undesirable property, easily exploitable by an adversary. We find that the simpler controllers admit more persistent bad solutions. We provide an algorithm for a systematic robustness study and prove the existence of persistent solutions and, in some cases, periodic orbits, using a computer-assisted proof methodology.
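To make the notion of a low-return solution concrete, the following is a minimal toy sketch and not the paper's algorithm. Everything in it is an assumption chosen for illustration: the scalar system dx/dt = x + u, the saturated linear controller, the reward -x^2, the threshold, and the loose reading of "persistent" as "the return stays low when the episode horizon is extended". The paper's own definition and benchmarks may differ.

```python
import random


def saturated_controller(x, gain=2.0, u_max=1.0):
    """Hypothetical controller: linear feedback clipped to [-u_max, u_max]."""
    return max(-u_max, min(u_max, -gain * x))


def episode_return(x0, controller, steps=200, dt=0.05, x_bound=10.0):
    """Roll out the toy unstable system dx/dt = x + u with explicit Euler.

    The per-step reward is -x^2 * dt, and the state is clipped to
    [-x_bound, x_bound] so diverging rollouts keep a finite return.
    """
    x, total = x0, 0.0
    for _ in range(steps):
        total -= x * x * dt
        x += dt * (x + controller(x))
        x = max(-x_bound, min(x_bound, x))
    return total


def low_return_starts(controller, starts, threshold, horizons=(200, 400)):
    """Keep the starts whose return is below threshold at every horizon."""
    return [
        x0 for x0 in starts
        if all(episode_return(x0, controller, steps=h) < threshold for h in horizons)
    ]


# Sample initial states (assumed range) and compare the mean with the bad subset.
rng = random.Random(0)
starts = [rng.uniform(-1.1, 1.1) for _ in range(1000)]
returns = [episode_return(x0, saturated_controller) for x0 in starts]
mean_return = sum(returns) / len(returns)
bad = low_return_starts(saturated_controller, starts, threshold=-50.0)
print(f"mean return: {mean_return:.2f}, low-return starts: {len(bad)} of {len(starts)}")
```

In this toy setup the clipped control cannot overcome the instability once |x| exceeds 1, so every flagged start lies outside that interval, while starts inside it decay to the origin. A sampling search like this only suggests candidates; it does not prove that a bad solution exists, which is the role the abstract assigns to computer-assisted proofs.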
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Memory and Neural Computing
Methods: Focus
