Data Sanity Check for Deep Learning Systems via Learnt Assertions

Haochuan Lu; Huanlin Xu; Nana Liu; Yangfan Zhou; Xin Wang

arXiv:1909.03835 · cs.LG · October 1, 2019 · 5 cites

TL;DR

This paper introduces an assertion-based data sanity check tool for deep learning systems that automatically detects invalid inputs by analyzing data flow footprints, thereby improving system reliability.

Contribution

It presents a novel automated assertion generation method for data flow analysis to identify invalid inputs in deep learning systems.

Findings

01 Effective in detecting invalid inputs in real-world scenarios

02 Automatically generates assertions tailored for DL data flow

03 Enhances reliability of DL-based systems

Abstract

Reliability is a critical consideration for DL-based systems. However, the statistical nature of DL makes such systems vulnerable to invalid inputs, i.e., cases that were not considered in the training phase of a DL model. This paper proposes performing a data sanity check to identify invalid inputs and thereby enhance the reliability of DL-based systems. We design and implement a tool that detects behavior deviation of a DL model when it processes an input case. The tool extracts data flow footprints and applies an assertion-based validation mechanism. The assertions are built automatically and are specifically tailored to DL model data flow analysis. Our experiments on real-world scenarios demonstrate that this assertion-based data sanity check mechanism is effective in identifying invalid input cases.
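The abstract does not say how footprints are extracted or how assertions are learnt, so the following is only a minimal sketch of the general idea, under stated assumptions: the "footprint" is taken to be the mean activation of each layer of a toy ReLU network, and each "assertion" is a per-layer range of mean plus or minus k standard deviations learnt from valid inputs. The names `forward_with_footprint`, `learn_assertions`, `sanity_check`, the fixed `WEIGHTS`, and the threshold `k` are all hypothetical and are not taken from the paper.

```python
import random
import statistics

# Hypothetical toy model: two fully connected ReLU layers with fixed weights.
WEIGHTS = [
    [[0.5, 0.2, 0.1], [0.1, 0.4, 0.3], [0.3, 0.3, 0.3], [0.2, 0.1, 0.6]],
    [[0.5, 0.5, 0.5, 0.5], [0.25, 0.25, 0.25, 0.25]],
]

def forward_with_footprint(weights, x):
    """Run the toy network; return (output, footprint).

    Assumption: the footprint is one number per layer, the mean activation.
    """
    footprint = []
    h = x
    for layer in weights:
        h = [max(0.0, sum(w * v for w, v in zip(row, h))) for row in layer]
        footprint.append(sum(h) / len(h))
    return h, footprint

def learn_assertions(weights, valid_inputs, k=4.0):
    """Learn one (low, high) range per layer from inputs known to be valid."""
    footprints = [forward_with_footprint(weights, x)[1] for x in valid_inputs]
    bounds = []
    for values in zip(*footprints):  # regroup footprint values by layer
        mu = statistics.mean(values)
        sigma = statistics.pstdev(values)
        bounds.append((mu - k * sigma, mu + k * sigma))
    return bounds

def sanity_check(weights, bounds, x):
    """Return True if every layer's footprint satisfies its learnt range."""
    _, footprint = forward_with_footprint(weights, x)
    return all(low <= f <= high for f, (low, high) in zip(footprint, bounds))

# Learn assertions from synthetic "training" inputs drawn from [0, 1]^3.
rng = random.Random(0)
train = [[rng.random() for _ in range(3)] for _ in range(200)]
bounds = learn_assertions(WEIGHTS, train)

in_range_ok = sanity_check(WEIGHTS, bounds, [0.5, 0.5, 0.5])
out_of_range_ok = sanity_check(WEIGHTS, bounds, [50.0, 50.0, 50.0])
print(in_range_ok, out_of_range_ok)
```

In this sketch an input far outside the training range drives the layer activations beyond the learnt ranges and fails the check, while an input resembling the training data passes. A coarse statistic such as the layer mean will miss many invalid inputs; the paper's tool may use a different footprint and a different assertion form.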

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Security and Verification in Computing · Digital and Cyber Forensics