Dos and Don'ts of Machine Learning in Computer Security
Daniel Arp, Erwin Quiring, Feargus Pendlebury, Alexander Warnecke, Fabio Pierazzi, Christian Wressnegger, Lorenzo Cavallaro, Konrad Rieck

TL;DR
This paper critically examines the use of machine learning in computer security, highlighting common pitfalls, analyzing recent literature, and offering practical recommendations to improve research quality and application reliability.
Contribution
It identifies widespread pitfalls in security-focused machine learning research, empirically demonstrates their impact, and proposes actionable guidelines to enhance future work.
Findings
Pitfalls are common in security ML research
Individual pitfalls can lead to unrealistic performance claims
Recommendations can help mitigate identified issues
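As a hedged illustration of the second finding, the sketch below shows how one evaluation choice, the class ratio of the test set, can inflate a reported result. The scenario is an assumption chosen for illustration and is not taken from the summary above: a hypothetical detector with a 99% true-positive rate and a 1% false-positive rate is scored once on a balanced test set and once on a set where only 0.1% of samples are malicious. The function name `precision` and all numbers are illustrative.

```python
def precision(tpr: float, fpr: float, prevalence: float) -> float:
    """Fraction of raised alerts that are true positives (Bayes' rule)."""
    true_alerts = tpr * prevalence            # malicious samples that are flagged
    false_alerts = fpr * (1.0 - prevalence)   # benign samples that are flagged
    return true_alerts / (true_alerts + false_alerts)


# Hypothetical detector: 99% true-positive rate, 1% false-positive rate.
balanced = precision(tpr=0.99, fpr=0.01, prevalence=0.5)     # 50% malicious
realistic = precision(tpr=0.99, fpr=0.01, prevalence=0.001)  # 0.1% malicious

print(f"precision at 50% malicious:  {balanced:.3f}")   # 0.990
print(f"precision at 0.1% malicious: {realistic:.3f}")  # 0.090
```

Under these assumed numbers the same detector goes from 99% precision to roughly 9% once the class ratio becomes skewed, so a result reported only on the balanced set would overstate how useful the alerts are in deployment.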
Abstract
With the growing processing power of computing systems and the increasing availability of massive datasets, machine learning algorithms have led to major breakthroughs in many different areas. This development has influenced computer security, spawning a series of work on learning-based security systems, such as for malware detection, vulnerability discovery, and binary code analysis. Despite great potential, machine learning in security is prone to subtle pitfalls that undermine its performance and render learning-based systems potentially unsuitable for security tasks and practical deployment. In this paper, we look at this problem with critical eyes. First, we identify common pitfalls in the design, implementation, and evaluation of learning-based security systems. We conduct a study of 30 papers from top-tier security conferences within the past 10 years, confirming that these…
Taxonomy
Topics: Advanced Malware Detection Techniques · Network Security and Intrusion Detection · Software Engineering Research
