CADM: Confusion Model-based Detection Method for Real-drift in Chunk Data Stream
Songqiao Hu, Zeyi Liu, Xiao He

TL;DR
This paper introduces CADM, a novel method for detecting real concept drift in chunk data streams by leveraging confusion between real and pseudo labels, using cosine similarity and adaptive thresholds to improve detection accuracy.
Contribution
The paper proposes a confusion-based approach for real-drift detection in chunk data streams, addressing the limitation that many existing methods can detect and adapt only to virtual drift.
Findings
Low false alarm rate in drift detection
Effective across different classifiers
Accurate detection of real concept drift
Abstract
Concept drift detection has attracted considerable attention due to its importance in many real-world applications such as health monitoring and fault diagnosis. Conventionally, most advanced approaches either perform poorly when the evaluation criteria of the environment have changed (i.e., concept drift) or can only detect and adapt to virtual drift. In this paper, we propose a new approach to detect real-drift in the chunk data stream with limited annotations, based on concept confusion. When a new data chunk arrives, we use both real labels and pseudo labels to update the model after prediction and drift detection. In this context, the model becomes confused and yields a prediction difference once drift occurs. We then adopt cosine similarity to measure this difference, and propose an adaptive threshold method to find abnormal values. Experiments show that our method has…
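The abstract names two ingredients, cosine similarity as the measure of prediction difference and an adaptive threshold for flagging abnormal values, but does not give their exact form here. The following is a minimal sketch of how such a pair could fit together, not the authors' implementation. It assumes that two prediction vectors for the same chunk are compared and that a similarity is abnormal when it falls more than `k` standard deviations below the mean of past similarities. The names `cosine_similarity`, `AdaptiveThresholdDetector`, `k`, `warmup`, and `min_std` are hypothetical.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two flattened prediction vectors."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return float(np.dot(a, b) / denom)


class AdaptiveThresholdDetector:
    """Flags a chunk when its similarity drops well below recent history.

    Assumed rule (not taken from the paper): drift is signalled when
    similarity < mean - k * std over previously accepted similarities.
    """

    def __init__(self, k=3.0, warmup=5, min_std=1e-3):
        self.k = k              # width of the tolerance band
        self.warmup = warmup    # chunks to observe before testing
        self.min_std = min_std  # floor so a flat history is not over-sensitive
        self.history = []

    def update(self, similarity):
        drift = False
        if len(self.history) >= self.warmup:
            mean = float(np.mean(self.history))
            std = max(float(np.std(self.history)), self.min_std)
            drift = similarity < mean - self.k * std
        if not drift:
            # Only non-drift chunks update the running statistics.
            self.history.append(similarity)
        return bool(drift)


# Usage: feed one similarity value per incoming chunk.
detector = AdaptiveThresholdDetector(k=3.0, warmup=5)
stable = [0.99, 0.98, 0.99, 0.97, 0.98, 0.99]
flags = [detector.update(s) for s in stable]
drift_flag = detector.update(0.40)  # sharp drop in agreement
```

Excluding flagged chunks from the history keeps a detected drift from widening the tolerance band; whether the paper does the same is not stated in the abstract.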
Taxonomy
Topics: Data Stream Mining Techniques · Network Security and Intrusion Detection · Air Quality Monitoring and Forecasting
