Dos and Don'ts of Machine Learning in Computer Security

Daniel Arp; Erwin Quiring; Feargus Pendlebury; Alexander Warnecke; Fabio Pierazzi; Christian Wressnegger; Lorenzo Cavallaro; Konrad Rieck

arXiv:2010.09470·cs.CR·December 1, 2021·24 cites

Open Access

TL;DR

This paper critically examines the use of machine learning in computer security, highlighting common pitfalls, analyzing recent literature, and offering practical recommendations to improve research quality and application reliability.

Contribution

It identifies widespread pitfalls in security-focused machine learning research, empirically demonstrates their impact, and proposes actionable guidelines to enhance future work.

Findings

01 Pitfalls are common in security ML research

02 Individual pitfalls can lead to unrealistic performance claims

03 Recommendations can help mitigate identified issues

Abstract

With the growing processing power of computing systems and the increasing availability of massive datasets, machine learning algorithms have led to major breakthroughs in many different areas. This development has influenced computer security, spawning a series of work on learning-based security systems, such as for malware detection, vulnerability discovery, and binary code analysis. Despite great potential, machine learning in security is prone to subtle pitfalls that undermine its performance and render learning-based systems potentially unsuitable for security tasks and practical deployment. In this paper, we look at this problem with critical eyes. First, we identify common pitfalls in the design, implementation, and evaluation of learning-based security systems. We conduct a study of 30 papers from top-tier security conferences within the past 10 years, confirming that these…

Taxonomy

Topics: Advanced Malware Detection Techniques · Network Security and Intrusion Detection · Software Engineering Research