Machine Learning Assessment: implications to cybersecurity
Waleed A. Yousef

TL;DR
This paper reviews resampling methods for assessing machine learning performance, focusing on error rate and AUC estimation, with implications for cybersecurity applications involving structured data and traditional ML algorithms.
Contribution
It provides a theoretical framework for resampling techniques that estimate the error rate and the AUC, highlighting their computational cost and their relevance to cybersecurity applications with structured data.
Findings
Resampling methods estimate the error rate and the AUC without requiring knowledge of the data's probability distribution.
Computational expense limits their use with deep neural networks.
Traditional ML algorithms are suitable for structured cybersecurity data.
Abstract
This chapter is dedicated to the assessment and performance estimation of machine learning (ML) algorithms, a topic that is as important as the construction of these algorithms, in particular in the context of cyberphysical security design. The literature is full of nonparametric methods to estimate a statistic from just one available dataset through resampling techniques, e.g., jackknife, bootstrap and cross validation (CV). Special statistics of great interest are the error rate and the area under the ROC curve (AUC) of a classification rule. The importance of these resampling methods stems from the fact that they require no knowledge about the probability distribution of the data or the construction details of the ML algorithm. This chapter provides a concise review of this literature to establish a coherent theoretical framework for these methods that can estimate both the…
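To make the idea of resampling from a single dataset concrete, here is a minimal Python sketch of a nonparametric bootstrap for the empirical AUC. It is an illustration under simplifying assumptions, not the chapter's own estimators: the scoring rule is treated as fixed, only the scores of the two classes are resampled (the classifier is not retrained on each replicate), and the function names `empirical_auc` and `bootstrap_auc`, the replicate count, and the synthetic Gaussian scores are all hypothetical choices made for the example.

```python
import random
import statistics


def empirical_auc(neg_scores, pos_scores):
    """Mann-Whitney form of the AUC: the fraction of (positive, negative)
    pairs in which the positive scores higher, counting ties as 1/2."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def bootstrap_auc(neg_scores, pos_scores, n_boot=500, seed=0):
    """Resample each class with replacement (stratified bootstrap) and
    return the mean and standard deviation of the AUC replicates.

    Assumption: the classifier is fixed, so only its scores are resampled.
    """
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_boot):
        neg_star = rng.choices(neg_scores, k=len(neg_scores))
        pos_star = rng.choices(pos_scores, k=len(pos_scores))
        replicates.append(empirical_auc(neg_star, pos_star))
    return statistics.mean(replicates), statistics.stdev(replicates)


if __name__ == "__main__":
    # Synthetic scores for illustration only: negatives ~ N(0, 1),
    # positives ~ N(1, 1).
    data_rng = random.Random(42)
    neg = [data_rng.gauss(0.0, 1.0) for _ in range(40)]
    pos = [data_rng.gauss(1.0, 1.0) for _ in range(40)]
    mean_auc, sd_auc = bootstrap_auc(neg, pos)
    print(f"empirical AUC:  {empirical_auc(neg, pos):.3f}")
    print(f"bootstrap mean: {mean_auc:.3f}, bootstrap SD: {sd_auc:.3f}")
```

Nothing in the sketch uses the distribution that generated the scores, which is the property the abstract emphasizes. Extending it to retrain the classifier on every replicate would multiply the cost by the training time, which is where the computational expense noted in the findings comes from.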
Taxonomy
Topics: Anomaly Detection Techniques and Applications
