TESSERACT: Eliminating Experimental Bias in Malware Classification across Space and Time (Extended Version)

Zeliang Kan, Shae McFadden, Daniel Arp, Feargus Pendlebury, Roberto Jordaney, Johannes Kinder, Fabio Pierazzi, Lorenzo Cavallaro

arXiv:2402.01359 · cs.LG · April 10, 2025 · 1 citation

TL;DR

This paper introduces TESSERACT, a framework to eliminate spatial and temporal biases in malware classification experiments, enabling more realistic evaluation and improved robustness of ML models over time.

Contribution

It proposes constraints for fair experiment design, a new robustness metric AUT, and an algorithm for tuning training data, addressing biases in malware detection research.
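The AUT metric named above can be illustrated with a minimal sketch. It assumes, following the published TESSERACT definition as best understood here, that AUT is the trapezoidal area under a per-period performance curve (for example, monthly F1 on successive test windows), normalized by the number of intervals so that the result falls in [0, 1]. The function name `aut` and the sample scores are hypothetical and are not taken from the authors' code.

```python
from typing import Sequence


def aut(scores: Sequence[float]) -> float:
    """Area Under Time (sketch): normalized trapezoidal area under
    a sequence of per-period scores, each assumed to lie in [0, 1]."""
    n = len(scores)
    if n < 2:
        raise ValueError("AUT needs scores for at least two test periods")
    # Trapezoidal rule with unit spacing between consecutive periods.
    area = sum((scores[k] + scores[k + 1]) / 2 for k in range(n - 1))
    # Dividing by the number of intervals keeps the value in [0, 1].
    return area / (n - 1)


# Hypothetical F1 scores over four consecutive test periods.
decaying = [1.0, 0.5, 0.5, 0.0]
stable = [0.75, 0.75, 0.75, 0.75]
aut_decaying = aut(decaying)
aut_stable = aut(stable)
```

Under this reading, a classifier whose score stays flat over time earns a higher AUT than one that starts high and decays, even when their first-period scores would rank them the other way.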

Findings

1. Biases inflate previous performance results
2. Periodic tuning improves classifier stability
3. Mitigation strategies delay performance decay

Abstract

Machine learning (ML) plays a pivotal role in detecting malicious software. Despite the high F1-scores reported in numerous studies reaching upwards of 0.99, the issue is not completely solved. Malware detectors often experience performance decay due to constantly evolving operating systems and attack methods, which can render previously learned knowledge insufficient for accurate decision-making on new inputs. This paper argues that commonly reported results are inflated due to two pervasive sources of experimental bias in the detection task: spatial bias caused by data distributions that are not representative of a real-world deployment; and temporal bias caused by incorrect time splits of data, leading to unrealistic configurations. To address these biases, we introduce a set of constraints for fair experiment design, and propose a new metric, AUT, for classifier robustness in…
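The temporal bias described in the abstract can be made concrete with a small sketch. It assumes each sample carries a timestamp and that a fair split places every training sample strictly before every test sample. `Sample`, `temporal_split`, and `violates_temporal_order` are hypothetical names introduced for illustration, and the data is made up; none of this comes from the TESSERACT codebase.

```python
from datetime import date
from typing import List, NamedTuple, Tuple


class Sample(NamedTuple):
    """Hypothetical record for one app; the field names are assumptions."""
    name: str
    first_seen: date
    is_malware: bool


def temporal_split(
    samples: List[Sample], cutoff: date
) -> Tuple[List[Sample], List[Sample]]:
    """Train on samples first seen before `cutoff`, test on the rest."""
    train = [s for s in samples if s.first_seen < cutoff]
    test = [s for s in samples if s.first_seen >= cutoff]
    return train, test


def violates_temporal_order(train: List[Sample], test: List[Sample]) -> bool:
    """True if some training sample is as new as, or newer than,
    the oldest test sample (knowledge from the 'future' leaks in)."""
    if not train or not test:
        return False
    newest_train = max(s.first_seen for s in train)
    oldest_test = min(s.first_seen for s in test)
    return newest_train >= oldest_test


# Made-up samples for illustration only.
samples = [
    Sample("a", date(2014, 3, 1), False),
    Sample("b", date(2014, 9, 15), True),
    Sample("c", date(2015, 2, 10), False),
    Sample("d", date(2015, 6, 20), True),
]

train, test = temporal_split(samples, cutoff=date(2015, 1, 1))
time_aware_ok = not violates_temporal_order(train, test)

# An interleaved split, similar in effect to random k-fold, mixes the periods.
interleaved_leaks = violates_temporal_order(samples[::2], samples[1::2])
```

The interleaved split trains on a 2015 sample while testing on a 2014 one, which is the kind of unrealistic configuration the abstract attributes to incorrect time splits.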

Taxonomy

Topics: Advanced Malware Detection Techniques · Network Security and Intrusion Detection · Anomaly Detection Techniques and Applications

Methods: Sparse Evolutionary Training