Bootstrapped Control Limits for Score-Based Concept Drift Control Charts
Jiezhong Wu, Daniel W. Apley

TL;DR
This paper introduces a bootstrap-based method to improve control limit calibration for score-based concept drift detection, enabling more accurate and efficient monitoring without large holdout sets.
Contribution
It develops a nested bootstrap procedure for calibrating control limits that uses the full initial dataset and corrects for bootstrap bias, enhancing concept drift detection accuracy.
Findings
Bootstrap calibration improves control limit accuracy.
The method reduces the need for large holdout sets.
Computational efficiency comparable to existing methods.
Abstract
Monitoring for changes in a predictive relationship represented by a fitted supervised learning model (i.e., concept drift detection) is a widespread problem in modern data-driven applications. A general and powerful Fisher score-based concept drift approach was recently proposed, in which detecting concept drift reduces to detecting changes in the mean of the model's score vector using a multivariate exponentially weighted moving average (MEWMA). To implement the approach, the initial data must be split into two subsets. The first subset serves as the training sample to which the model is fit, and the second subset serves as an out-of-sample test set from which the MEWMA control limit (CL) is determined. In this paper, we retain the same score-based MEWMA monitoring statistic as the existing method and focus instead on improving the computation of the control limit. We develop a novel…
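The holdout-based baseline described in the abstract can be illustrated with a minimal sketch. It covers only the existing split-sample scheme (fit on one subset, set the MEWMA control limit from the other), not the paper's bootstrap procedure. Everything specific here is an assumption for illustration: the linear regression model with unit-variance Gaussian errors, the smoothing constant `lam`, the empirical-quantile rule for the control limit, centering by the holdout score mean, and the helper names `scores`, `mewma_t2`, and `simulate`.

```python
import numpy as np

def scores(X, y, theta):
    """Per-observation score vectors: gradient of the Gaussian log-likelihood
    (unit error variance assumed) for linear regression, evaluated at theta."""
    resid = y - X @ theta
    return X * resid[:, None]

def mewma_t2(S, mean, cov_inv, lam=0.05):
    """Run a MEWMA over the rows of S and return the T^2-type statistic per step."""
    z = np.zeros(S.shape[1])
    t2 = np.empty(len(S))
    for t, s in enumerate(S):
        z = lam * (s - mean) + (1.0 - lam) * z  # exponentially weighted average
        t2[t] = z @ cov_inv @ z                 # quadratic form in the smoothed score
    return t2

rng = np.random.default_rng(0)

def simulate(n, theta):
    """Hypothetical data generator: intercept plus one standard-normal feature."""
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    y = X @ theta + rng.normal(size=n)
    return X, y

theta0 = np.array([1.0, 2.0])
X_train, y_train = simulate(500, theta0)  # first subset: model fitting
X_hold, y_hold = simulate(500, theta0)    # second subset: control limit

# Fit the model on the training subset only.
theta_hat, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# Estimate in-control score mean and covariance from the holdout subset,
# then take a high empirical quantile of the holdout MEWMA path as the CL.
S_hold = scores(X_hold, y_hold, theta_hat)
mean = S_hold.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(S_hold, rowvar=False))
cl = np.quantile(mewma_t2(S_hold, mean, cov_inv), 0.999)

# Monitor new data whose slope has drifted from 2 to 4.
X_new, y_new = simulate(300, np.array([1.0, 4.0]))
t2_new = mewma_t2(scores(X_new, y_new, theta_hat), mean, cov_inv)
alarm = bool((t2_new > cl).any())
print("control limit:", cl, "alarm raised:", alarm)
```

In this sketch the holdout subset is spent entirely on estimating the control limit, which is the cost the abstract attributes to the existing approach: observations held out for calibration are not available for fitting the model.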
Taxonomy
Topics: Data Stream Mining Techniques
