SPARK: Static Program Analysis Reasoning and Retrieving Knowledge
Wasuwee Sodsong, Bernhard Scholz, Sanjay Chawla

TL;DR
This paper introduces a machine learning pipeline that induces symbolic security analysis rules for programs, enabling automated reasoning about program security without execution, demonstrated on large real-world codebases.
Contribution
It presents a two-stage pipeline that combines a recurrent neural network (RNN) with symbolic rule extraction for program security analysis, and proposes a sampling-based similarity measure for comparing two infinite regular languages.
Findings
Deducing security rules is feasible for large codebases such as OpenJDK.
The approach requires sufficient training data and a balanced distribution of program paths.
The pipeline generates symbolic security rules from real-world data.
Abstract
Program analysis is a technique for reasoning about programs without executing them, and it has various applications in compilers, integrated development environments, and security. In this work, we present a machine learning pipeline that induces a security analyzer for programs by example. The security analyzer determines whether a program is secure or insecure based on symbolic rules deduced by our machine learning pipeline. The pipeline has two stages: a Recurrent Neural Network (RNN) and an Extractor that converts the RNN into symbolic rules. To evaluate the quality of the learned symbolic rules, we propose a sampling-based similarity measurement between two infinite regular languages. We conduct a case study using real-world data. We also discuss the limitations of existing techniques and possible future improvements. The results…
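The abstract does not spell out how the sampling-based similarity is computed. As a rough illustration of the general idea only, the sketch below assumes each regular language is given as a deterministic finite automaton (DFA), draws random words up to a bounded length, and reports the fraction of words on which the two automata agree. The `DFA` class, the sampling scheme, and all parameter names are hypothetical and are not taken from the paper.

```python
import random
from typing import Dict, Sequence, Set, Tuple


class DFA:
    """A hypothetical, minimal DFA; missing transitions reject the word."""

    def __init__(self, start: int, accepting: Set[int],
                 transitions: Dict[Tuple[int, str], int]) -> None:
        self.start = start
        self.accepting = accepting
        self.transitions = transitions

    def accepts(self, word: str) -> bool:
        state = self.start
        for symbol in word:
            key = (state, symbol)
            if key not in self.transitions:
                return False  # no transition defined: reject
            state = self.transitions[key]
        return state in self.accepting


def sample_word(alphabet: Sequence[str], max_len: int,
                rng: random.Random) -> str:
    """Pick a length uniformly in [0, max_len], then symbols uniformly."""
    length = rng.randint(0, max_len)
    return "".join(rng.choice(alphabet) for _ in range(length))


def sampled_similarity(a: DFA, b: DFA, alphabet: Sequence[str],
                       n_samples: int = 10_000, max_len: int = 20,
                       seed: int = 0) -> float:
    """Fraction of sampled words on which both automata give the same verdict.

    This is an assumed agreement-rate estimate, not the paper's measure.
    """
    rng = random.Random(seed)
    agree = 0
    for _ in range(n_samples):
        word = sample_word(alphabet, max_len, rng)
        if a.accepts(word) == b.accepts(word):
            agree += 1
    return agree / n_samples


# Two toy languages over {a, b}: words with an even number of 'a',
# and the language of all words.
even_a = DFA(0, {0}, {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1})
any_word = DFA(0, {0}, {(0, "a"): 0, (0, "b"): 0})

if __name__ == "__main__":
    print(sampled_similarity(even_a, even_a, "ab"))
    print(sampled_similarity(even_a, any_word, "ab"))
```

Because both languages are infinite, an exhaustive comparison is impossible, so the estimate depends on the sampling distribution. A length-first sampler like this one weights short words more heavily than uniform sampling over all words up to the bound would.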
Taxonomy
Topics: Software Engineering Research · Software Testing and Debugging Techniques · Advanced Malware Detection Techniques
