DANDI: Diffusion as Normative Distribution for Deep Neural Network Input
Somin Kim, Shin Yoo

TL;DR
DANDI estimates Surprise Adequacy (SA) for DNNs using synthetic data generated by Stable Diffusion. This removes the need for the original training data while keeping SA values highly correlated with those computed from real training data and preserving their effectiveness for test prioritization.
Contribution
DANDI computes Surprise Adequacy without access to the training data by using synthetic data from Stable Diffusion as a surrogate input distribution, which makes SA-based DNN testing applicable when the training set is unavailable.
Findings
SA values from DANDI strongly correlate with those from real training data
DANDI's synthetic data-based SA effectively prioritizes inputs for DNN testing
The method works well on the CIFAR10 and ImageNet-1K datasets
Abstract
Surprise Adequacy (SA) has been widely studied as a test adequacy metric that can effectively guide software engineers towards inputs that are more likely to reveal unexpected behaviour of Deep Neural Networks (DNNs). Intuitively, SA is an out-of-distribution metric that quantifies the dissimilarity between the given input and the training data: if a new input is very different from those seen during training, the DNN is more likely to behave unexpectedly against the input. While SA has been widely adopted as a test prioritization method, its major weakness is the fact that the computation of the metric requires access to the training dataset, which is often not allowed in real-world use cases. We present DANDI, a technique that generates a surrogate input distribution using Stable Diffusion to compute SA values without requiring the original training data. An empirical evaluation of…
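The intuition in the abstract (score an input by how far it lies from a reference distribution, then test the most surprising inputs first) can be illustrated with a minimal sketch. It assumes a distance-based SA variant computed over activation traces, which this summary does not confirm as the variant used in DANDI. The function names `distance_based_sa` and `prioritize`, and the idea of passing activation traces as NumPy arrays, are hypothetical choices made for illustration. The reference set `ref_traces` stands in for either the real training data or, as in DANDI, images generated by Stable Diffusion.

```python
import numpy as np

def distance_based_sa(trace, pred_class, ref_traces, ref_labels):
    """Distance-based surprise of one input against a reference set.

    trace:      activation vector of the input under test (assumed given)
    pred_class: class the DNN predicted for that input
    ref_traces: activation vectors of the reference inputs, shape (n, d)
    ref_labels: class labels of the reference inputs, shape (n,)
    """
    same = ref_traces[ref_labels == pred_class]
    other = ref_traces[ref_labels != pred_class]
    # Distance to the nearest reference trace of the predicted class.
    d_same = np.linalg.norm(same - trace, axis=1)
    nearest = same[np.argmin(d_same)]
    dist_a = d_same.min()
    # Distance from that neighbour to the nearest trace of any other class.
    dist_b = np.linalg.norm(other - nearest, axis=1).min()
    return dist_a / dist_b

def prioritize(traces, pred_classes, ref_traces, ref_labels):
    """Return input indices ordered from most to least surprising, plus scores."""
    scores = np.array([
        distance_based_sa(t, c, ref_traces, ref_labels)
        for t, c in zip(traces, pred_classes)
    ])
    return np.argsort(-scores), scores

# Toy reference set: two classes, far apart along the first axis.
ref_traces = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0], [11.0, 0.0]])
ref_labels = np.array([0, 0, 1, 1])

# Two inputs, both predicted as class 0; the second lies between the classes.
test_traces = np.array([[0.0, 1.0], [5.0, 0.0]])
test_preds = np.array([0, 0])

order, scores = prioritize(test_traces, test_preds, ref_traces, ref_labels)
```

In this toy setup the second input sits farther from its predicted class and closer to the class boundary, so it receives the higher score and is ranked first. Swapping `ref_traces` for traces of synthetic images leaves the rest of the computation unchanged, which is the property the surrogate-distribution idea relies on.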
Taxonomy
Topics: Neural Networks and Applications
Methods: Diffusion
