Fingerprinting Concepts in Data Streams with Supervised and Unsupervised Meta-Information
Ben Halstead, Yun Sing Koh, Patricia Riddle, Mykola Pechenizkiy, Albert Bifet, Russel Pears

TL;DR
This paper introduces FiCSUM, a flexible framework that uses a diverse set of meta-information features to improve concept representation and drift detection in data streams, outperforming existing methods.
Contribution
FiCSUM provides a novel, dynamic weighting approach to combine multiple meta-information features for more accurate concept representation and drift detection in data streams.
Findings
FiCSUM outperforms state-of-the-art methods in accuracy.
It effectively models underlying concept drift.
The framework works well across diverse datasets.
Abstract
Streaming sources of data are becoming more common as the ability to collect data in real-time grows. A major concern in dealing with data streams is concept drift, a change in the distribution of data over time, for example, due to changes in environmental conditions. Representing concepts (stationary periods featuring similar behaviour) is a key idea in adapting to concept drift. By testing the similarity of a concept representation to a window of observations, we can detect concept drift to a new or previously seen recurring concept. Concept representations are constructed using meta-information features, values describing aspects of concept behaviour. We find that previously proposed concept representations rely on small numbers of meta-information features. These representations often cannot distinguish concepts, leaving systems vulnerable to concept drift. We propose FiCSUM, a…
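The similarity test described in the abstract can be sketched as follows. This is a minimal illustration, not the FiCSUM implementation: the four meta-information features (mean, standard deviation, lag-1 autocorrelation, classifier error rate), the weighted cosine similarity, the fixed weights, and the 0.9 threshold are all assumptions chosen for the example, and the function names `fingerprint`, `weighted_cosine`, and `is_drift` are hypothetical.

```python
import math
import statistics


def fingerprint(values, errors):
    """Summarise a window as a vector of meta-information features.

    values: numeric observations of one input feature in the window
    errors: 0/1 flags, 1 where a classifier misclassified the observation
    The feature set here is illustrative, not the one used by FiCSUM.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    # Lag-1 autocorrelation; defined as 0.0 when the window has no variance.
    if std == 0:
        autocorr = 0.0
    else:
        autocorr = sum(
            (a - mean) * (b - mean) for a, b in zip(values, values[1:])
        ) / (len(values) * std * std)
    error_rate = statistics.fmean(errors)
    return [mean, std, autocorr, error_rate]


def weighted_cosine(u, v, weights):
    """Cosine similarity in which each feature is scaled by a weight."""
    dot = sum(w * a * b for w, a, b in zip(weights, u, v))
    norm_u = math.sqrt(sum(w * a * a for w, a in zip(weights, u)))
    norm_v = math.sqrt(sum(w * b * b for w, b in zip(weights, v)))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def is_drift(concept_fp, window_fp, weights, threshold=0.9):
    """Flag drift when the window no longer resembles the stored concept."""
    return weighted_cosine(concept_fp, window_fp, weights) < threshold


# Hypothetical data: a stored concept and a later, differently behaved window.
concept_fp = fingerprint([1.0, 1.2, 0.9, 1.1, 1.0, 0.8], [0, 0, 1, 0, 0, 0])
window_fp = fingerprint([5.0, 1.0, 6.0, 0.5, 5.5, 1.5], [1, 1, 0, 1, 1, 0])
weights = [0.1, 1.0, 1.0, 1.0]  # assumed fixed weights for illustration
print(is_drift(concept_fp, window_fp, weights))
```

The weights are fixed here only to keep the sketch short. In a dynamic weighting scheme of the kind the Contribution section describes, they would be re-estimated as the stream evolves, so that features which separate concepts well count for more in the comparison.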
Taxonomy
Topics: Data Stream Mining Techniques · Innovative Microfluidic and Catalytic Techniques Innovation · Time Series Analysis and Forecasting
