Data Sanity Check for Deep Learning Systems via Learnt Assertions
Haochuan Lu, Huanlin Xu, Nana Liu, Yangfan Zhou, Xin Wang

TL;DR
This paper introduces an assertion-based data sanity check tool for deep learning systems that automatically detects invalid inputs by analyzing data flow footprints, thereby improving system reliability.
Contribution
It presents a novel automated assertion generation method for data flow analysis to identify invalid inputs in deep learning systems.
Findings
Effective in detecting invalid inputs in real-world scenarios
Automatically generates assertions tailored for DL data flow
Enhances reliability of DL-based systems
Abstract
Reliability is a critical consideration for DL-based systems, but the statistical nature of DL makes it vulnerable to invalid inputs, i.e., cases that were not considered in the training phase of a DL model. This paper proposes performing a data sanity check to identify invalid inputs and thereby enhance the reliability of DL-based systems. We design and implement a tool that detects behavior deviation of a DL model when it processes an input case. The tool extracts data flow footprints and applies an assertion-based validation mechanism. The assertions are built automatically and are specifically tailored for DL model data flow analysis. Our experiments on real-world scenarios demonstrate that this assertion-based data sanity check mechanism is effective in identifying invalid input cases.
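The abstract does not specify what a data flow footprint contains or what form the learnt assertions take, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' tool. It assumes a footprint is the mean activation of each layer of a toy ReLU network, and that an assertion is a per-layer value range learnt from inputs known to be valid. All names (`forward_with_footprint`, `learn_assertions`, `is_valid`, `margin`) are hypothetical.

```python
import random

def forward_with_footprint(weights, x):
    """Run a tiny ReLU network; return (output, footprint).
    Assumption: the footprint is the mean activation of each layer."""
    footprint = []
    h = x
    for W in weights:
        h = [max(0.0, sum(w * v for w, v in zip(row, h))) for row in W]
        footprint.append(sum(h) / len(h))
    return h, footprint

def learn_assertions(weights, valid_inputs, margin=0.1):
    """Learn one (low, high) range per layer from footprints of valid inputs.
    Assumption: assertions are simple range checks widened by a margin."""
    footprints = [forward_with_footprint(weights, x)[1] for x in valid_inputs]
    bounds = []
    for layer_values in zip(*footprints):
        lo, hi = min(layer_values), max(layer_values)
        slack = margin * (hi - lo)
        bounds.append((lo - slack, hi + slack))
    return bounds

def is_valid(weights, bounds, x):
    """Return True if every layer of the footprint satisfies its assertion."""
    _, footprint = forward_with_footprint(weights, x)
    return all(lo <= v <= hi for v, (lo, hi) in zip(footprint, bounds))

# Toy model: 4 inputs -> 8 hidden units -> 3 outputs, fixed random weights.
rng = random.Random(0)
weights = [
    [[rng.uniform(0.0, 1.0) for _ in range(4)] for _ in range(8)],
    [[rng.uniform(0.0, 1.0) for _ in range(8)] for _ in range(3)],
]

# "Training-like" inputs drawn from [0, 1]^4 stand in for valid data.
valid_inputs = [[rng.uniform(0.0, 1.0) for _ in range(4)] for _ in range(200)]
bounds = learn_assertions(weights, valid_inputs)

in_range = valid_inputs[0]
out_of_range = [50.0, 50.0, 50.0, 50.0]  # far outside the training range

print(is_valid(weights, bounds, in_range))
print(is_valid(weights, bounds, out_of_range))
```

In this sketch the out-of-range input drives the layer activations far beyond anything seen on valid data, so at least one range assertion fails. A real system would likely use richer footprints and assertions than per-layer means and ranges.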
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Security and Verification in Computing · Digital and Cyber Forensics
