DeepGauge: Multi-Granularity Testing Criteria for Deep Learning Systems
Lei Ma, Felix Juefei-Xu, Fuyuan Zhang, Jiyuan Sun, Minhui Xue, Bo Li,, Chunyang Chen, Ting Su, Li Li, Yang Liu, Jianjun Zhao, Yadong Wang

TL;DR
DeepGauge introduces multi-granularity testing criteria to evaluate and improve the robustness and generality of deep learning systems, addressing their vulnerabilities beyond mere accuracy metrics.
Contribution
It proposes a novel set of testing criteria for DL systems that provide a multi-faceted evaluation of their robustness and generality.
Findings
Effective in revealing vulnerabilities of DL systems
Applicable across multiple datasets and models
Enhances understanding of DL system robustness
Abstract
Deep learning (DL) defines a new data-driven programming paradigm that constructs the internal system logic of a crafted neuron network through a set of training data. We have seen wide adoption of DL in many safety-critical scenarios. However, a plethora of studies have shown that the state-of-the-art DL systems suffer from various vulnerabilities which can lead to severe consequences when applied to real-world applications. Currently, the testing adequacy of a DL system is usually measured by the accuracy of test data. Considering the limitation of accessible high quality test data, good accuracy performance on test data can hardly provide confidence to the testing adequacy and generality of DL systems. Unlike traditional software systems that have clear and controllable logic and functionality, the lack of interpretability in a DL system makes system analysis and defect detection…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
MethodsInterpretability
