Standardized Evaluation of Machine Learning Methods for Evolving Data Streams

Johannes Haug; Effi Tramountani; Gjergji Kasneci

arXiv:2204.13625 · cs.LG · April 29, 2022 · 1 cites

TL;DR

This paper introduces a comprehensive framework and standards for evaluating online machine learning methods on evolving data streams, emphasizing realistic, reliable, and interpretable assessments.

Contribution

It proposes standardized properties, evaluation strategies, and a modular Python framework called float for consistent assessment of online learning methods.

Findings

01 Defines key properties for high-quality online learning evaluation
02 Introduces the float framework for standardized testing
03 Enhances comparability and reliability of online ML evaluations

Abstract

Due to the unspecified and dynamic nature of data streams, online machine learning requires powerful and flexible solutions. However, evaluating online machine learning methods under realistic conditions is difficult. Existing work therefore often draws on different heuristics and simulations that do not necessarily produce meaningful and reliable results. Indeed, in the absence of common evaluation standards, it often remains unclear how online learning methods will perform in practice or in comparison to similar work. In this paper, we propose a comprehensive set of properties for high-quality machine learning in evolving data streams. In particular, we discuss sensible performance measures and evaluation strategies for online predictive modelling, online feature selection and concept drift detection. As one of the first works, we also look at the interpretability of online learning…
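To make the idea of an evaluation strategy for online predictive modelling concrete, here is a minimal sketch of test-then-train (prequential) evaluation, a strategy commonly used for data streams. It is an illustration only: it is not taken from the paper and does not use the float API. The names `MajorityClassLearner` and `prequential_accuracy`, and the toy stream, are hypothetical and chosen for this example.

```python
from collections import Counter
from typing import Hashable, Iterable, List, Optional, Tuple


class MajorityClassLearner:
    """Hypothetical toy online classifier: predicts the most frequent label seen so far."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def predict(self, x: Tuple) -> Optional[Hashable]:
        # Before any label has been observed there is nothing to predict.
        if not self.counts:
            return None
        return self.counts.most_common(1)[0][0]

    def partial_fit(self, x: Tuple, y: Hashable) -> None:
        # Incremental update from a single observation.
        self.counts[y] += 1


def prequential_accuracy(
    model: MajorityClassLearner, stream: Iterable[Tuple[Tuple, Hashable]]
) -> List[float]:
    """Test-then-train: score each observation before the model learns from it."""
    correct = 0
    trace: List[float] = []
    for t, (x, y) in enumerate(stream, start=1):
        correct += int(model.predict(x) == y)  # 1. test on the unseen observation
        model.partial_fit(x, y)                # 2. then train on it
        trace.append(correct / t)              # running accuracy up to step t
    return trace


# Toy stream (an assumption for illustration): the label changes from "a" to "b"
# after three observations, mimicking an abrupt concept drift.
stream = [((i,), "a") for i in range(3)] + [((i,), "b") for i in range(3, 8)]
trace = prequential_accuracy(MajorityClassLearner(), stream)
print([round(a, 2) for a in trace])
```

Because every observation is scored before it is used for training, the running accuracy reflects performance on unseen data at each time step, and a drop in the trace after the label change shows how a distribution shift can surface in such a measure.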

Code & Models

Repositories

haugjo/float (official)

Taxonomy

Topics: Data Stream Mining Techniques · Machine Learning and Data Classification · Advanced Bandit Algorithms Research

Methods: Feature Selection