Beware of the Unexpected: Bimodal Taint Analysis
Yiu Wai Chow, Max Schäfer, Michael Pradel

TL;DR
This paper introduces Fluffy, a bimodal taint analysis combining static data flow reasoning with machine learning to better identify potentially problematic data flows in code, improving vulnerability detection accuracy.
Contribution
It presents a novel framework that integrates static analysis with machine learning to distinguish expected from unexpected taint flows, enhancing security vulnerability detection.
Findings
Achieves an F1 score of 0.85 or higher on multiple vulnerability types
Successfully applied to 250,000 JavaScript projects
Demonstrates improved detection of unexpected data flows
Abstract
Static analysis is a powerful tool for detecting security vulnerabilities and other programming problems. Global taint tracking, in particular, can spot vulnerabilities arising from complicated data flow across multiple functions. However, precisely identifying which flows are problematic is challenging, and sometimes depends on factors beyond the reach of pure program analysis, such as conventions and informal knowledge. For example, learning that a parameter "name" of an API function "locale" ends up in a file path is surprising and potentially problematic. In contrast, it would be completely unsurprising to find that a parameter "command" passed to an API function "execaCommand" is eventually interpreted as part of an operating-system command. This paper presents Fluffy, a bimodal taint analysis that combines static analysis, which reasons about data flow, with machine learning,…
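The expected-versus-unexpected distinction in the abstract can be illustrated with a toy sketch. This is not Fluffy's implementation: a hand-written keyword table stands in for the learned model, and the names `TaintFlow`, `SINK_HINTS`, `is_expected`, and the sink labels `path_traversal` and `command_injection` are all assumptions made for illustration. The static-analysis half is also assumed to have already produced each flow as a (function name, parameter name, sink kind) triple.

```python
import re
from dataclasses import dataclass

# Hypothetical hint words per sink kind. In the paper this judgment is
# learned from data; the fixed table here is only a stand-in.
SINK_HINTS = {
    "command_injection": {"command", "cmd", "exec", "shell", "run"},
    "path_traversal": {"path", "file", "dir", "filename"},
}


@dataclass(frozen=True)
class TaintFlow:
    """A flow reported by a (hypothetical) static taint analysis."""
    function_name: str   # API function whose parameter is the source
    parameter_name: str  # parameter that reaches the sink
    sink_kind: str       # kind of sink the parameter flows into


def split_identifier(identifier):
    """Split a camelCase or snake_case identifier into lowercase words."""
    return {part.lower() for part in re.findall(r"[A-Za-z][a-z]*", identifier)}


def is_expected(flow):
    """Return True if the names hint that the flow into this sink is intended."""
    words = split_identifier(flow.function_name) | split_identifier(flow.parameter_name)
    return bool(words & SINK_HINTS.get(flow.sink_kind, set()))


# The two examples from the abstract.
surprising = TaintFlow("locale", "name", "path_traversal")
unsurprising = TaintFlow("execaCommand", "command", "command_injection")

for flow in (surprising, unsurprising):
    verdict = "expected" if is_expected(flow) else "unexpected, worth reviewing"
    print(f"{flow.function_name}({flow.parameter_name}) -> {flow.sink_kind}: {verdict}")
```

A fixed keyword table like this misses synonyms and project-specific conventions, which is the kind of informal knowledge the abstract says lies beyond pure program analysis and motivates a learned component.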
Taxonomy
Topics: Software Engineering Research · Web Application Security Vulnerabilities · Advanced Malware Detection Techniques
