Standardized Evaluation of Machine Learning Methods for Evolving Data Streams
Johannes Haug, Effi Tramountani, Gjergji Kasneci

TL;DR
This paper introduces a comprehensive framework and standards for evaluating online machine learning methods on evolving data streams, emphasizing realistic, reliable, and interpretable assessments.
Contribution
It proposes standardized properties, evaluation strategies, and a modular Python framework called float for consistent assessment of online learning methods.
Findings
Defines key properties for high-quality online learning evaluation
Introduces the float framework for standardized testing
Enhances comparability and reliability of online ML evaluations
Abstract
Due to the unspecified and dynamic nature of data streams, online machine learning requires powerful and flexible solutions. However, evaluating online machine learning methods under realistic conditions is difficult. Existing work therefore often draws on different heuristics and simulations that do not necessarily produce meaningful and reliable results. Indeed, in the absence of common evaluation standards, it often remains unclear how online learning methods will perform in practice or in comparison to similar work. In this paper, we propose a comprehensive set of properties for high-quality machine learning in evolving data streams. In particular, we discuss sensible performance measures and evaluation strategies for online predictive modelling, online feature selection and concept drift detection. As one of the first works, we also look at the interpretability of online learning…
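As a rough illustration of what an evaluation strategy for online predictive modelling can look like, the sketch below implements test-then-train (prequential) evaluation, a widely used scheme in the data stream literature. The excerpt above does not say which strategies the paper recommends, so treating prequential evaluation as representative is an assumption. The sketch does not use the float API: `MajorityClassClassifier`, `prequential_accuracy`, and `drifting_stream` are hypothetical names written in plain Python for this example only.

```python
import random


class MajorityClassClassifier:
    """Toy online learner (hypothetical): predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = {}

    def predict(self, x):
        # Before any training there is nothing to predict.
        if not self.counts:
            return None
        return max(self.counts, key=self.counts.get)

    def learn(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1


def prequential_accuracy(model, stream):
    """Test-then-train: each sample is used for prediction first, then for learning.

    Returns the running accuracy after every sample.
    """
    correct = 0
    history = []
    for total, (x, y) in enumerate(stream, start=1):
        y_pred = model.predict(x)      # test on the unseen sample
        correct += int(y_pred == y)
        history.append(correct / total)
        model.learn(x, y)              # only then update the model
    return history


def drifting_stream(n, drift_at, seed=0):
    """Synthetic binary stream (assumption) whose majority label flips at `drift_at`."""
    rng = random.Random(seed)
    for t in range(n):
        majority = 0 if t < drift_at else 1
        y = majority if rng.random() < 0.9 else 1 - majority
        yield rng.random(), y


if __name__ == "__main__":
    history = prequential_accuracy(MajorityClassClassifier(), drifting_stream(1000, 500))
    print(f"running accuracy before drift: {history[499]:.3f}")
    print(f"running accuracy at end of stream: {history[-1]:.3f}")
```

Because every sample is predicted before the model sees its label, the running accuracy reflects performance on unseen data at each time step, so a drop after the simulated drift point would show up in `history` directly.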
Taxonomy
Topics: Data Stream Mining Techniques · Machine Learning and Data Classification · Advanced Bandit Algorithms Research
Methods: Feature Selection
