A Scalable Approach to Covariate and Concept Drift Management via Adaptive Data Segmentation
Vennela Yarabolu, Govind Waghmare, Sonia Gupta, Siddhartha Asthana

TL;DR
This paper presents a scalable data segmentation framework that effectively manages both covariate and concept drift in machine learning, enhancing accuracy and operational efficiency in large-scale deployments.
Contribution
It introduces a novel data segmentation-based approach that adaptively manages multiple drift types, improving model robustness and reducing resource consumption.
Findings
Improves model accuracy on real-world datasets
Reduces operational costs and latency
Effectively handles multiple drift types
Abstract
In many real-world applications, continuous machine learning (ML) systems are crucial but prone to data drift, a phenomenon in which discrepancies between historical training data and future test data lead to significant performance degradation and operational inefficiencies. Traditional drift adaptation methods typically update models using ensemble techniques, often discarding drifted historical data, and focus primarily on either covariate drift or concept drift. These methods face issues such as high resource demands, an inability to manage all types of drift effectively, and neglect of the valuable context that historical data can provide. We contend that explicitly incorporating drifted data into the model training process significantly enhances model accuracy and robustness. This paper introduces an advanced framework that integrates the strengths of data-centric approaches with…
