Automated Repair of Neural Networks
Dor Cohen, Ofer Strichman

TL;DR
This paper presents a framework that repairs unsafe neural networks by minimally adjusting weights to satisfy safety specifications, using SMT solvers, thereby enhancing robustness against adversarial attacks with minimal accuracy loss.
Contribution
Introduces a novel SMT-based method for automatically repairing neural networks to meet safety properties while maintaining decision boundary similarity.
Findings
Successfully repairs neural networks with up to hundreds of parameters.
Achieves improved adversarial robustness with only mild accuracy loss.
Outperforms naive baseline methods in empirical evaluations.
Abstract
Over the last decade, Neural Networks (NNs) have been widely used in numerous applications, including safety-critical ones such as autonomous systems. Despite their growing adoption, it is well known that NNs are susceptible to adversarial attacks. Hence, it is highly important to provide guarantees that such systems work correctly. To remedy these issues, we introduce a framework for repairing unsafe NNs w.r.t. a safety specification by utilizing satisfiability modulo theories (SMT) solvers. Our method is able to search for a new, safe NN representation by modifying only a few of its weight values. In addition, our technique attempts to maximize the similarity to the original network with regard to its decision boundaries. We perform extensive experiments which demonstrate the capability of our proposed framework to yield safe NNs w.r.t. the Adversarial Robustness property, with…
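To make the repair objective in the abstract concrete, here is a toy sketch. It is not the paper's SMT encoding. It assumes a single linear unit f(x) = w·x + b, an L∞ perturbation ball of radius eps, and a repair that may change only the bias. Under those assumptions the smallest safe change has a closed form, so no solver is needed. The names `worst_case_margin` and `repair_bias`, and the `slack` parameter, are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def worst_case_margin(w: Sequence[float], b: float,
                      x: Sequence[float], eps: float) -> float:
    """Smallest value of w.x' + b over all x' with ||x' - x||_inf <= eps.

    For a linear unit the minimum is reached by moving each coordinate
    by eps against the sign of its weight, which costs eps * ||w||_1.
    """
    dot = sum(wi * xi for wi, xi in zip(w, x))
    return dot + b - eps * sum(abs(wi) for wi in w)


def repair_bias(w: Sequence[float], b: float,
                points: Sequence[Sequence[float]], eps: float,
                slack: float = 1e-6) -> float:
    """Return the bias closest to b (from above) such that every point in
    `points` stays on the positive side under any eps-bounded perturbation.

    Assumption: all points are meant to be classified as positive, and
    only the bias may change (a one-parameter "minimal modification").
    """
    worst = min(worst_case_margin(w, b, x, eps) for x in points)
    delta = max(0.0, slack - worst)  # zero if the unit is already safe
    return b + delta


# Usage: the original unit is not robust at x = (1, 0) for eps = 0.5.
w = [1.0, -2.0]
x = [1.0, 0.0]
print(worst_case_margin(w, 0.0, x, 0.5))   # negative: an adversarial x' exists
b_new = repair_bias(w, 0.0, [x], 0.5)
print(worst_case_margin(w, b_new, x, 0.5))  # positive after the repair
```

For real networks with nonlinear activations there is no such closed form, which is where a constraint solver over the selected weights, as the abstract describes, comes in.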
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Software Testing and Debugging Techniques · Software Reliability and Analysis Research
Methods: Repair
