DeepSafe: A Data-driven Approach for Checking Adversarial Robustness in Neural Networks
Divya Gopinath, Guy Katz, Corina S. Pasareanu, Clark Barrett

TL;DR
DeepSafe introduces a data-driven method combining clustering and verification to identify safe input regions and assess adversarial robustness in neural networks, addressing limitations of existing pointwise checks.
Contribution
The paper presents an approach that automatically finds and verifies safe regions of a neural network's input space. It also introduces targeted robustness, which ensures that no input in a given region is mapped to a given target label.
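As a rough illustration of what checking a candidate region involves, the sketch below builds a ball-shaped region from a cluster of same-label inputs and searches it for a misclassified point. Everything in it is an assumption made for illustration, not the authors' implementation: `toy_classifier` is a hypothetical stand-in for a trained network, the radius is taken as the mean distance to the centroid, and random sampling replaces the formal verification step. Sampling can only find counterexamples; unlike a verifier, it cannot prove a region safe.

```python
import math
import random


def toy_classifier(x):
    """Hypothetical stand-in for a trained network (three labels: 0, 1, 2)."""
    if x[0] > 5:
        return 2
    return 1 if x[0] + x[1] > 0 else 0


def centroid(points):
    """Coordinate-wise mean of a cluster of points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def mean_radius(points, cen):
    """Assumed region radius: average distance from cluster members to the centroid."""
    return sum(math.dist(p, cen) for p in points) / len(points)


def sample_in_ball(cen, radius, rng):
    """Draw a point from the ball by rejection sampling from its bounding box."""
    while True:
        x = [c + rng.uniform(-radius, radius) for c in cen]
        if math.dist(x, cen) <= radius:
            return x


def find_counterexample(classify, cen, radius, label, target=None,
                        trials=2000, seed=0):
    """Search the ball for an input that violates robustness.

    target=None: any output other than `label` is a violation.
    target=t:    only output t is a violation (targeted robustness).
    Returns a violating input, or None if the search found none.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        x = sample_in_ball(cen, radius, rng)
        out = classify(x)
        violated = (out == target) if target is not None else (out != label)
        if violated:
            return x
    return None


# Two hypothetical clusters whose members are all classified as label 1.
cluster_a = [[2.0, 2.0], [3.0, 2.5], [2.5, 3.0], [3.5, 3.5]]    # far from the boundary
cluster_b = [[0.2, 0.2], [1.0, -0.6], [-0.6, 1.0], [0.6, 0.6]]  # close to the boundary

cen_a = centroid(cluster_a)
r_a = mean_radius(cluster_a, cen_a)
cen_b = centroid(cluster_b)
r_b = mean_radius(cluster_b, cen_b)

print("region A:", find_counterexample(toy_classifier, cen_a, r_a, label=1))
print("region B:", find_counterexample(toy_classifier, cen_b, r_b, label=1))
print("region B, target 2:",
      find_counterexample(toy_classifier, cen_b, r_b, label=1, target=2))
```

In this toy setup, region B overlaps the decision boundary, so the untargeted search finds an input labeled 0, while the targeted search for label 2 finds nothing. This shows how a region can fail the general robustness check and still pass a targeted one.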
Findings
Identified multiple safe regions in MNIST and ACAS Xu networks.
Discovered adversarial perturbations within the evaluated networks.
Confirmed the effectiveness of the approach in real-world neural network models.
Abstract
Deep neural networks have become widely used, obtaining remarkable results in domains such as computer vision, speech recognition, natural language processing, audio recognition, social network filtering, machine translation, and bio-informatics, where they have produced results comparable to human experts. However, these networks can be easily fooled by adversarial perturbations: minimal changes to correctly classified inputs that cause the network to misclassify them. This phenomenon represents a concern for both safety and security, but it is currently unclear how to measure a network's robustness against such perturbations. Existing techniques are limited to checking robustness around a few individual input points, providing only very limited guarantees. We propose a novel approach for automatically identifying safe regions of the input space, within which the network is robust…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Explainable Artificial Intelligence (XAI)
