Unveiling Statistical Significance of Online Regression over Multiple Datasets
Mohammad Abu-Shaira, Weishi Shi

TL;DR
This paper investigates statistical significance testing for online regression models across multiple datasets, emphasizing the importance of robust methods for validating performance in evolving data environments.
Contribution
It evaluates and applies the Friedman test and post-hoc analyses to compare online regression models across datasets, filling a gap in statistical validation for online learning.
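As a rough illustration of the kind of comparison described here, the sketch below ranks models on each dataset, computes the Friedman chi-square statistic from the average ranks, and then derives a Nemenyi critical difference for pairwise follow-up. The Nemenyi procedure is an assumption made for illustration, since this summary does not name the post-hoc test used. The error values, the three-model setup, and the helper names (rank_row, friedman, nemenyi_cd) are likewise hypothetical and not taken from the paper. No tie correction is applied to the statistic.

```python
import math

# Critical values q_alpha for the Nemenyi test at alpha = 0.05,
# indexed by the number of models k (commonly tabulated values).
Q_ALPHA_005 = {2: 1.960, 3: 2.343, 4: 2.569, 5: 2.728}


def rank_row(errors):
    """Rank one dataset's errors (1 = lowest error); ties share the average rank."""
    order = sorted(range(len(errors)), key=lambda i: errors[i])
    ranks = [0.0] * len(errors)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and errors[order[j + 1]] == errors[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks


def friedman(errors_by_dataset):
    """Return (average rank per model, Friedman chi-square statistic)."""
    n = len(errors_by_dataset)      # number of datasets
    k = len(errors_by_dataset[0])   # number of models
    rank_rows = [rank_row(row) for row in errors_by_dataset]
    avg_ranks = [sum(r[j] for r in rank_rows) / n for j in range(k)]
    chi2 = 12 * n / (k * (k + 1)) * (
        sum(r * r for r in avg_ranks) - k * (k + 1) ** 2 / 4
    )
    return avg_ranks, chi2


def nemenyi_cd(k, n, q_table=Q_ALPHA_005):
    """Critical difference in average rank for k models over n datasets."""
    return q_table[k] * math.sqrt(k * (k + 1) / (6 * n))


# Hypothetical errors: rows are datasets, columns are models A, B, C.
errors = [
    [0.10, 0.20, 0.30],
    [0.12, 0.25, 0.40],
    [0.08, 0.15, 0.22],
    [0.20, 0.18, 0.35],
    [0.11, 0.21, 0.31],
]

avg_ranks, chi2 = friedman(errors)
cd = nemenyi_cd(k=3, n=5)

# Pairs of models whose average ranks differ by more than the critical difference.
significant_pairs = [
    (a, b)
    for a in range(3)
    for b in range(a + 1, 3)
    if abs(avg_ranks[a] - avg_ranks[b]) > cd
]
print(avg_ranks, chi2, cd, significant_pairs)
```

The statistic is compared against a chi-square distribution with k - 1 degrees of freedom. With only five datasets the approximation is coarse, so this should be read as a demonstration of the mechanics rather than a recommended experimental design.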
Findings
The Friedman test confirms the performance of competitive models.
Some statistical test results indicate room for improvement in certain methods.
The evaluation covers real and synthetic datasets, using cross-validation.
Abstract
Despite extensive focus on techniques for evaluating the performance of two learning algorithms on a single dataset, the critical challenge of developing statistical tests to compare multiple algorithms across various datasets has been largely overlooked in most machine learning research. Additionally, in the realm of Online Learning, ensuring statistical significance is essential to validate continuous learning processes, particularly for achieving rapid convergence and effectively managing concept drifts in a timely manner. Robust statistical methods are needed to assess the significance of performance differences as data evolves over time. This article examines the state-of-the-art online regression models and empirically evaluates several suitable tests. To compare multiple online regression models across various datasets, we employed the Friedman test along with corresponding…
Taxonomy
Topics: Data Stream Mining Techniques · Imbalanced Data Classification Techniques · Machine Learning and Data Classification
