The Influence of Multiple Classes on Learning Online Classifiers from Imbalanced and Concept Drifting Data Streams
Agnieszka Lipska, Jerzy Stefanowski

TL;DR
This study investigates how local data characteristics and concept drifts affect the performance of online classifiers on multi-class imbalanced data streams, highlighting the impact of class overlap, rare examples, and class drift scenarios.
Contribution
It introduces a categorization of local data factors and drifts in imbalanced streams, together with synthetic stream generators that model them, and evaluates online classifiers across the generated scenarios.
Findings
Overlap between multiple minority classes (borderline examples) makes learning much harder than in streams with a single minority class.
Rare examples are the most difficult single factor for classifiers.
Complex drift scenarios worsen classifier evaluation metrics.
Abstract
This work experimentally studies how local data characteristics and drifts affect the difficulty of learning various online classifiers from multi-class imbalanced data streams. First, we present a categorization of these data factors and drifts in the context of imbalanced streams; then we introduce generators of synthetic streams that model these factors and drifts. The results of many experiments with synthetically generated data streams show that overlap between multiple minority classes (the borderline type of examples) plays a much greater role than in streams with one minority class. The presence of rare examples in the stream is the most difficult single factor. The local drift of splitting minority classes is the third most influential factor. Unlike binary streams, the specialized UOB and OOB classifiers perform well enough for even high imbalance…
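To make the idea of a synthetic imbalanced stream with drift concrete, here is a minimal sketch in Python. It is not the authors' generator. The function name `imbalanced_stream`, its parameters, the Gaussian class shapes, and the example ratios are all assumptions chosen for illustration. It models only two of the factors named above: class overlap and a sudden change in class ratios.

```python
import random
from collections import Counter


def imbalanced_stream(n, ratios_before, ratios_after, drift_at, overlap=0.0, seed=0):
    """Yield (x, y) pairs drawn from per-class 2-D Gaussian blobs.

    Hypothetical illustration, not the generator from the paper.
    Class priors switch from ratios_before to ratios_after at index
    drift_at (a sudden class-ratio drift). overlap in [0, 1] pulls the
    class centers toward the origin, so classes overlap more.
    """
    rng = random.Random(seed)
    k = len(ratios_before)
    classes = list(range(k))
    # Assumed layout: centers spaced along the x-axis, shrunk by overlap.
    centers = [(4.0 * c * (1.0 - overlap), 0.0) for c in classes]
    for t in range(n):
        ratios = ratios_before if t < drift_at else ratios_after
        y = rng.choices(classes, weights=ratios)[0]
        cx, cy = centers[y]
        x = (rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
        yield x, y


# One majority class and two minority classes; the imbalance weakens at t = 1000.
stream = list(
    imbalanced_stream(2000, [0.9, 0.05, 0.05], [0.4, 0.3, 0.3], drift_at=1000)
)
before = Counter(y for _, y in stream[:1000])
after = Counter(y for _, y in stream[1000:])
print("class counts before drift:", dict(before))
print("class counts after drift:", dict(after))
```

Raising `overlap` toward 1 moves the class centers together, which is one simple way to produce more borderline examples. Rare examples and the splitting of minority classes would need additional logic that this sketch does not attempt.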
Taxonomy
Topics: Data Stream Mining Techniques · Network Security and Intrusion Detection · Smart Grid Energy Management
