Towards security defect prediction with AI
Carson D. Sestili, William S. Snavely, Nathan M. VanHoudnos

TL;DR
This paper compares a memory-network AI system with static analysis tools for buffer overflow detection, highlights the strengths and limitations of each, and points to future work on code representations that capture scope information and on deep learning techniques.
Contribution
It introduces s-bAbI, a code generator that produces an arbitrarily large number of code samples of controlled complexity, and uses the resulting dataset to evaluate an AI system against static analysis tools for security defect detection.
Findings
Most of the static analysis engines examined have good precision but poor recall on the dataset.
The AI system can reach performance similar to the static analysis engines, but only with an exhaustive amount of training data.
The one sound static analyzer examined is the exception: it has both good precision and good recall.
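To make the precision and recall distinction in these findings concrete, here is a minimal sketch in Python. The labels and the `cautious_tool` predictions are invented for illustration and are not results from the paper; `precision_recall` is a hypothetical helper, not part of any tool the authors evaluated.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for boolean 'is defective' labels."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if not p and a)
    # Guard against division by zero when nothing is flagged or nothing is defective.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


# Hypothetical ground truth: four samples contain an overflow, two do not.
actual = [True, True, True, True, False, False]
# Hypothetical tool that flags only the one case it is sure about.
cautious_tool = [True, False, False, False, False, False]

precision, recall = precision_recall(cautious_tool, actual)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A tool like this never raises a false alarm, so its precision is perfect, yet it misses three of the four real defects, which is the "good precision, poor recall" pattern described above.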
Abstract
In this study, we investigate the limits of the current state-of-the-art AI system for detecting buffer overflows and compare it with current static analysis tools. To do so, we developed a code generator, s-bAbI, capable of producing an arbitrarily large number of code samples of controlled complexity. We found that the static analysis engines we examined have good precision but poor recall on this dataset, except for a sound static analyzer that has good precision and recall. We found that the state-of-the-art AI system, a memory network modeled after Choi et al. [1], can achieve similar performance to the static analysis engines, but requires an exhaustive amount of training data in order to do so. Our work points towards future approaches that may solve these problems; namely, using representations of code that can capture appropriate scope information and using deep learning…
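The idea of a generator that emits labeled samples of controlled complexity can be sketched as follows. This is a toy illustration only, not the s-bAbI generator itself: the function name `generate_sample`, the C template, and the parameters `buf_size` and `max_index` are all assumptions made for this example.

```python
import random


def generate_sample(rng, buf_size=10, max_index=20):
    """Return (c_source, is_overflow) for one toy buffer-write program.

    Hypothetical template: a fixed-size buffer and one constant index.
    The label is known by construction, so no analysis is needed to get it.
    """
    index = rng.randrange(max_index)
    is_overflow = index >= buf_size  # write past the end of buf
    source = (
        "int main(void) {\n"
        f"    char buf[{buf_size}];\n"
        f"    int idx = {index};\n"
        "    buf[idx] = 'a';\n"
        "    return 0;\n"
        "}\n"
    )
    return source, is_overflow


# A seeded generator makes the dataset reproducible; the size is arbitrary.
rng = random.Random(0)
samples = [generate_sample(rng) for _ in range(5)]
for source, is_overflow in samples:
    label = "UNSAFE" if is_overflow else "SAFE"
    print(label, source.splitlines()[2].strip())
```

Because every label follows from the template parameters, such a generator can produce as many correctly labeled examples as a data-hungry model requires, and complexity can be varied by changing the template.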
Taxonomy
Topics: Industrial Vision Systems and Defect Detection
