Lightweight Automated Feature Monitoring for Data Streams
João Conde, Ricardo Moreira, João Torres, Pedro Cardoso, Hugo R.C. Ferreira, Marco O.P. Sampaio, João Tiago Ascensão, Pedro Bizarro

TL;DR
This paper introduces a lightweight, data-driven feature monitoring system for real-time data streams that detects data drifts with minimal memory and computational resources, aiding in root cause analysis.
Contribution
The paper presents a novel, efficient feature monitoring method that combines a multivariate statistical test with Exponential Moving Histograms to detect drift in streaming data.
Findings
Detects data drifts with low memory and computational cost
Provides interpretable feature rankings during alarms
Identifies complex problems not tied to single features
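To make the feature-ranking finding concrete, here is a minimal sketch of how features might be ranked once an alarm fires. It assumes each feature already has a reference histogram and a current histogram, both normalised to frequencies. The Jensen-Shannon divergence used here is a stand-in chosen for illustration; this summary does not say which statistic the paper uses, and the names `js_divergence` and `rank_features` are hypothetical.

```python
from math import log2


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, so in [0, 1]) between two
    discrete distributions given as equal-length lists of frequencies."""
    def kl(a, b):
        # Terms with a == 0 contribute nothing; b > 0 whenever a > 0
        # because b is the mixture of the two distributions.
        return sum(x * log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def rank_features(reference, current):
    """Return (feature, score) pairs sorted from most to least divergent.

    `reference` and `current` map feature names to frequency lists.
    The divergence is an assumed stand-in, not the paper's statistic.
    """
    scores = {
        name: js_divergence(reference[name], current[name])
        for name in reference
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical example: "amount" has shifted, "hour" has not.
reference = {"amount": [0.5, 0.5], "hour": [0.5, 0.5]}
current = {"amount": [1.0, 0.0], "hour": [0.5, 0.5]}
ranked = rank_features(reference, current)
```

In this example `ranked` lists `amount` first, which is the kind of ordering an analyst could use as a starting point for root cause analysis.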
Abstract
Monitoring the behavior of automated real-time stream processing systems has become one of the most relevant problems in real-world applications. Such systems have grown in complexity, relying heavily on high-dimensional input data and data-hungry Machine Learning (ML) algorithms. We propose a flexible system, Feature Monitoring (FM), that detects data drifts in such data sets, with a small and constant memory footprint and a small computational cost in streaming applications. The method is based on a multivariate statistical test and is data-driven by design (full reference distributions are estimated from the data). It monitors all features that are used by the system, while providing an interpretable feature ranking whenever an alarm occurs (to aid in root cause analysis). The computational and memory lightness of the system results from the use of Exponential Moving Histograms. In…
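The abstract attributes the constant memory footprint to Exponential Moving Histograms. The sketch below shows one plausible reading of that idea: a fixed set of bins whose counts are multiplied by a decay factor on every new event, so older events fade out and memory never grows. The class name, the `half_life` parameter, and the clamping of out-of-range values into the edge bins are all assumptions made for illustration, not details taken from the paper.

```python
from bisect import bisect_right


class ExponentialMovingHistogram:
    """Fixed-bin histogram whose counts decay exponentially per event.

    Memory is constant: one float per bin, regardless of stream length.
    """

    def __init__(self, bin_edges, half_life):
        self.bin_edges = list(bin_edges)
        self.counts = [0.0] * (len(self.bin_edges) - 1)
        # Assumed parameterisation: an event's weight halves after
        # `half_life` further events have been observed.
        self.decay = 0.5 ** (1.0 / half_life)

    def update(self, value):
        # Fade every existing count, then add the new event with weight 1.
        self.counts = [c * self.decay for c in self.counts]
        idx = bisect_right(self.bin_edges, value) - 1
        # Assumption: values outside the edges fall into the first/last bin.
        idx = min(max(idx, 0), len(self.counts) - 1)
        self.counts[idx] += 1.0

    def frequencies(self):
        """Return the counts normalised to sum to 1 (or zeros if empty)."""
        total = sum(self.counts)
        if total == 0:
            return list(self.counts)
        return [c / total for c in self.counts]


# Hypothetical usage on a single numeric feature.
hist = ExponentialMovingHistogram(bin_edges=[0, 1, 2, 3], half_life=1)
for x in (0.5, 2.5):
    hist.update(x)
```

Each update costs time proportional to the number of bins, which is fixed up front, so the per-event cost stays constant as the stream grows.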
Taxonomy
Topics: Data Stream Mining Techniques · Time Series Analysis and Forecasting · Anomaly Detection Techniques and Applications
Methods: Test
