Statically Detecting Vulnerabilities by Processing Programming Languages as Natural Languages
Ibéria Medeiros (1), Nuno Neves (1), Miguel Correia (2) ((1) LASIGE, Faculdade de Ciências, Universidade de Lisboa, Portugal, (2) INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Portugal)

TL;DR
This paper introduces a novel AI-based static analysis approach that uses natural language processing techniques to automatically detect vulnerabilities in web application source code, demonstrated through the DEKANT tool on PHP and WordPress plugins.
Contribution
It presents a new method employing NLP sequence models for vulnerability detection, reducing the need for manual programming of detection rules.
Findings
Detected several hundred vulnerabilities, including 62 zero-day flaws.
Successfully applied to a large set of PHP applications and WordPress plugins.
Identified 12 classes of input validation vulnerabilities.
Abstract
Web applications continue to be a favorite target for hackers due to a combination of wide adoption and rapid deployment cycles, which often lead to the introduction of high-impact vulnerabilities. Static analysis tools are important to search for bugs automatically in the program source code, supporting developers in their removal. However, building these tools requires programming the knowledge of how to discover the vulnerabilities. This paper presents an alternative approach in which tools learn to detect flaws automatically by resorting to artificial intelligence concepts, more concretely to natural language processing. The approach employs a sequence model to learn to characterize vulnerabilities based on an annotated corpus. Afterwards, the model is utilized to discover and identify vulnerabilities in the source code. It was implemented in the DEKANT tool and evaluated…
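To make the idea of "a sequence model learned from an annotated corpus" concrete, the following is a minimal sketch, not the DEKANT implementation. It assumes the sequence model is a hidden Markov model trained by counting and decoded with the Viterbi algorithm; the abstract only says "sequence model", so that choice is an assumption. The token names (`input`, `var`, `sanitize`, `sink`, `const`), the tags (`Taint`, `Safe`), the toy corpus, and the function names are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical annotated corpus: each code slice is a list of (token, tag) pairs.
# Tokens abstract source-code elements; tags say whether data is tainted there.
CORPUS = [
    [("input", "Taint"), ("var", "Taint"), ("sink", "Taint")],
    [("input", "Taint"), ("var", "Taint"), ("var", "Taint"), ("sink", "Taint")],
    [("input", "Taint"), ("sink", "Taint")],
    [("input", "Taint"), ("sanitize", "Safe"), ("var", "Safe"), ("sink", "Safe")],
    [("input", "Taint"), ("var", "Taint"), ("sanitize", "Safe"), ("sink", "Safe")],
    [("const", "Safe"), ("var", "Safe"), ("sink", "Safe")],
]

def train(corpus, alpha=1.0):
    """Estimate HMM log-probabilities by counting, with add-alpha smoothing."""
    tags = sorted({tag for s in corpus for _, tag in s})
    vocab = {tok for s in corpus for tok, _ in s}
    start, trans, emit = Counter(), defaultdict(Counter), defaultdict(Counter)
    for s in corpus:
        start[s[0][1]] += 1
        for tok, tag in s:
            emit[tag][tok] += 1
        for (_, prev), (_, cur) in zip(s, s[1:]):
            trans[prev][cur] += 1

    def log_prob(counts, key, n_outcomes):
        total = sum(counts.values())
        return math.log((counts[key] + alpha) / (total + alpha * n_outcomes))

    return {
        "tags": tags,
        "start": lambda t: log_prob(start, t, len(tags)),
        "trans": lambda p, t: log_prob(trans[p], t, len(tags)),
        # One extra outcome reserves probability mass for unseen tokens.
        "emit": lambda t, w: log_prob(emit[t], w, len(vocab) + 1),
    }

def viterbi(model, tokens):
    """Return the most likely tag sequence for a token sequence."""
    if not tokens:
        return []
    tags = model["tags"]
    score = {t: model["start"](t) + model["emit"](t, tokens[0]) for t in tags}
    backpointers = []
    for w in tokens[1:]:
        new_score, ptr = {}, {}
        for t in tags:
            best = max(tags, key=lambda p: score[p] + model["trans"](p, t))
            new_score[t] = score[best] + model["trans"](best, t) + model["emit"](t, w)
            ptr[t] = best
        score = new_score
        backpointers.append(ptr)
    path = [max(tags, key=score.get)]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    return path[::-1]

def is_vulnerable(model, tokens):
    """Flag a slice when its final token (the sink) is tagged as tainted."""
    path = viterbi(model, tokens)
    return bool(path) and path[-1] == "Taint"

MODEL = train(CORPUS)
print(viterbi(MODEL, ["input", "var", "sink"]))              # ['Taint', 'Taint', 'Taint']
print(viterbi(MODEL, ["input", "sanitize", "var", "sink"]))  # ['Taint', 'Safe', 'Safe', 'Safe']
```

In this sketch the detection knowledge lives entirely in the counted probabilities, so extending coverage would mean annotating more slices rather than writing new rules, which is the trade-off the abstract describes.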
