Statistical Estimation of Malware Detection Metrics in the Absence of Ground Truth
Pang Du, Zheyuan Sun, Huashan Chen, Jin-Hee Cho, Shouhuai Xu

TL;DR
This paper develops and validates statistical estimators to accurately measure malware detection metrics without relying on ground truth, addressing a critical challenge in cybersecurity evaluation.
Contribution
It introduces novel statistical estimators for malware detection metrics in the absence of ground truth and analyzes their properties for improved measurement accuracy.
Findings
Estimators perform well on synthetic data with known ground truth.
Estimators provide meaningful metrics on real VirusTotal data.
Adjustments improve estimator accuracy under certain conditions.
Abstract
The accurate measurement of security metrics is a critical research problem because an improper or inaccurate measurement process can ruin the usefulness of the metrics, no matter how well they are defined. This is a highly challenging problem, particularly when the ground truth is unknown or noisy. In contrast to the well-recognized importance of defining security metrics, the measurement of security metrics has been little understood in the literature. In this paper, we measure five malware detection metrics in the absence of ground truth, which is a realistic setting that imposes many technical challenges. The ultimate goal is to develop principled, automated methods for measuring these metrics at the maximum accuracy possible. The problem naturally calls for investigations into statistical estimators by casting the measurement problem as a statistical estimation problem.…
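To make the idea of treating measurement as estimation concrete, the following is a minimal sketch and not the paper's estimators. It assumes a hypothetical set of five detectors that err independently, uses a naive majority vote as a stand-in for the missing ground truth, and checks the resulting rate estimates against synthetic data where the truth is known. All names, rates, and sample sizes here are illustrative assumptions.

```python
import random

def simulate_votes(n_files, detectors, malware_rate, rng):
    """Draw hidden labels and one verdict per detector for each file.

    detectors is a list of (true_positive_rate, false_positive_rate) pairs;
    verdicts are assumed independent given the hidden label.
    """
    truth = [rng.random() < malware_rate for _ in range(n_files)]
    votes = []
    for is_malware in truth:
        row = [rng.random() < (tpr if is_malware else fpr)
               for tpr, fpr in detectors]
        votes.append(row)
    return truth, votes

def majority_labels(votes):
    """Proxy label: a file is called malware if most detectors flag it."""
    return [2 * sum(row) > len(row) for row in votes]

def rates(labels, verdicts):
    """True-positive and false-positive rate of verdicts against labels."""
    tp = sum(1 for lab, v in zip(labels, verdicts) if lab and v)
    fp = sum(1 for lab, v in zip(labels, verdicts) if not lab and v)
    pos = sum(labels)
    neg = len(labels) - pos
    return tp / pos, fp / neg

rng = random.Random(0)
# Hypothetical detector qualities, chosen only for illustration.
detectors = [(0.95, 0.02), (0.90, 0.05), (0.85, 0.03), (0.80, 0.10), (0.92, 0.04)]
truth, votes = simulate_votes(20000, detectors, 0.3, rng)
proxy = majority_labels(votes)

results = []  # (estimated TPR, estimated FPR, reference TPR, reference FPR)
for j in range(len(detectors)):
    verdicts = [row[j] for row in votes]
    est_tpr, est_fpr = rates(proxy, verdicts)    # no ground truth used
    ref_tpr, ref_fpr = rates(truth, verdicts)    # available only in simulation
    results.append((est_tpr, est_fpr, ref_tpr, ref_fpr))
    print(f"detector {j}: TPR est={est_tpr:.3f} ref={ref_tpr:.3f}, "
          f"FPR est={est_fpr:.3f} ref={ref_fpr:.3f}")
```

Because each detector also contributes to the vote it is scored against, this naive proxy is slightly biased in the detector's favor, and it can degrade sharply when detectors are weak or correlated. Limitations of that kind are one reason principled estimators with analyzable properties are worth studying.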
Taxonomy
Topics: Advanced Malware Detection Techniques · Anomaly Detection Techniques and Applications · Network Security and Intrusion Detection
