Open-Source Drift Detection Tools in Action: Insights from Two Use Cases
Rieke Müller, Mohamed Abdelaal, Davor Stjelja

TL;DR
This paper evaluates open-source data drift detection tools using real-world smart building data, highlighting their strengths and limitations in practical ML deployment scenarios.
Contribution
It introduces D3Bench, a microbenchmark for assessing open-source drift detection tools, and provides comparative insights from two real-world use cases.
Findings
Evidently AI is best for general data drift detection.
NannyML excels at identifying shift timing and impact on accuracy.
Tools vary in usability and computational efficiency.
Abstract
Data drifts pose a critical challenge in the lifecycle of machine learning (ML) models, affecting their performance and reliability. In response to this challenge, we present a microbenchmark study, called D3Bench, which evaluates the efficacy of open-source drift detection tools. D3Bench examines the capabilities of Evidently AI, NannyML, and Alibi-Detect, leveraging real-world data from two smart building use cases. We prioritize assessing the functional suitability of these tools to identify and analyze data drifts. Furthermore, we consider a comprehensive set of non-functional criteria, such as the integrability with ML pipelines, the adaptability to diverse data types, user-friendliness, computational efficiency, and resource demands. Our findings reveal that Evidently AI stands out for its general data drift detection, whereas NannyML excels at pinpointing the precise timing of…
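To make "data drift detection" concrete, the following is a minimal sketch of one common univariate check: a two-sample Kolmogorov-Smirnov test comparing a reference window against a current window. It is not taken from the paper and does not use the API of Evidently AI, NannyML, or Alibi-Detect. The function names, the synthetic temperature readings, and the significance level are all assumptions made for illustration.

```python
import bisect
import math
import random


def ks_statistic(reference, current):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    ref = sorted(reference)
    cur = sorted(current)
    max_gap = 0.0
    # The gap between two step functions peaks at one of the observed values.
    for x in ref + cur:
        cdf_ref = bisect.bisect_right(ref, x) / len(ref)
        cdf_cur = bisect.bisect_right(cur, x) / len(cur)
        max_gap = max(max_gap, abs(cdf_ref - cdf_cur))
    return max_gap


def drift_detected(reference, current, alpha=0.05):
    """Flag drift when the KS statistic exceeds the large-sample critical value."""
    n, m = len(reference), len(current)
    # Asymptotic threshold: c(alpha) * sqrt((n + m) / (n * m)),
    # where c(alpha) = sqrt(-ln(alpha / 2) / 2).
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    threshold = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, current) > threshold


# Hypothetical data: room temperature readings before and after a setpoint change.
rng = random.Random(0)
reference = [rng.gauss(21.0, 1.0) for _ in range(500)]
shifted = [rng.gauss(23.0, 1.0) for _ in range(500)]

shift_flagged = drift_detected(reference, shifted)
```

A per-feature test like this only reports whether two windows differ in distribution. It says nothing about when the shift began or how it affects model accuracy, which are the aspects the findings above attribute to dedicated tooling.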
Taxonomy
Topics: Data Stream Mining Techniques · Traffic Prediction and Management Techniques · Smart Grid Energy Management
Methods: Sparse Evolutionary Training
