Quilt: Robust Data Segment Selection against Concept Drifts
Minsu Kim, Seong-Hyeon Hwang, Steven Euijong Whang

TL;DR
Quilt is a data-centric framework that improves model accuracy on streaming data with concept drifts by selecting the data segments that benefit training and discarding the rest, outperforming existing methods in accuracy and efficiency.
Contribution
The paper introduces Quilt, a data segment selection approach that explicitly uses drifted data and extends existing data subset selection techniques to improve accuracy and efficiency under concept drift.
Findings
Outperforms state-of-the-art drift adaptation methods
Effectively discards drifted data segments
Maintains high model accuracy with reduced training data
Abstract
Continuous machine learning pipelines are common in industrial settings where models are periodically trained on data streams. Unfortunately, concept drifts may occur in data streams, where the joint distribution of the data X and label y, P(X, y), changes over time and possibly degrades model accuracy. Existing concept drift adaptation approaches mostly focus on updating the model to the new data, possibly using ensemble techniques of previous models, and tend to discard the drifted historical data. However, we contend that explicitly utilizing the drifted data together leads to much better model accuracy and propose Quilt, a data-centric framework for identifying and selecting data segments that maximize model accuracy. To address the potential downside of efficiency, Quilt extends existing data subset selection techniques, which can be used to reduce the training data without…
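To make the idea of data segment selection concrete, here is a minimal toy sketch. It is not Quilt's algorithm, which this page does not describe in detail. It assumes a simple greedy rule (keep a past segment only if adding it does not lower accuracy on validation data from the current segment) and a 1-D nearest-centroid classifier. All function names and data below are hypothetical.

```python
from statistics import mean

def fit_centroids(samples):
    """Fit a 1-D nearest-centroid classifier: one mean feature value per label."""
    by_label = {}
    for x, y in samples:
        by_label.setdefault(y, []).append(x)
    return {label: mean(xs) for label, xs in by_label.items()}

def accuracy(centroids, samples):
    """Fraction of samples whose nearest centroid carries the true label."""
    correct = 0
    for x, y in samples:
        pred = min(centroids, key=lambda label: abs(x - centroids[label]))
        correct += pred == y
    return correct / len(samples)

def select_segments(past_segments, current_train, current_val):
    """Greedy rule (an assumption, not the paper's method): keep a past
    segment only if adding it does not lower current validation accuracy."""
    selected = []
    train = list(current_train)
    best = accuracy(fit_centroids(train), current_val)
    for idx, segment in enumerate(past_segments):
        candidate = train + list(segment)
        score = accuracy(fit_centroids(candidate), current_val)
        if score >= best:
            selected.append(idx)
            train, best = candidate, score
    return selected, best

# Current concept: small x has label 0, large x has label 1.
current_train = [(1.0, 0), (9.0, 1)]
current_val = [(2.0, 0), (3.0, 0), (7.0, 1), (8.0, 1)]

# One past segment follows the current concept; the other has flipped labels,
# i.e. P(X, y) differs from the current segment.
stable_segment = [(0.0, 0), (2.0, 0), (8.0, 1), (10.0, 1)]
drifted_segment = [(0.0, 1), (1.0, 1), (2.0, 1), (8.0, 0), (9.0, 0), (10.0, 0)]

selected, best = select_segments(
    [drifted_segment, stable_segment], current_train, current_val
)
print(selected, best)
```

In this toy data the flipped-label segment pulls both centroids to the wrong side and is rejected, while the segment that matches the current concept is kept. A real pipeline would replace the classifier and the acceptance rule with whatever the chosen method prescribes.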
Taxonomy
Topics: Data Stream Mining Techniques · Machine Learning and Data Classification · Air Quality Monitoring and Forecasting
