Temporal Taskification in Streaming Continual Learning: A Source of Evaluation Instability
Nicolae Filat, Ahmed Hussain, Konstantinos Kalogiannis, Elena Burceanu

TL;DR
This paper demonstrates that the way a data stream is divided into tasks significantly impacts the evaluation outcomes in streaming continual learning, highlighting the importance of considering taskification as a core evaluation factor.
Contribution
The authors introduce a framework to analyze the effects of temporal taskification on continual learning evaluation, revealing its influence on performance metrics and evaluation stability.
Findings
Different task splits lead to substantial variations in forecasting error and forgetting.
Shorter taskifications cause noisier patterns and higher sensitivity to boundary changes.
Evaluation results depend on taskification, not just data and model.
Abstract
Streaming Continual Learning (CL) typically converts a continuous stream into a sequence of discrete tasks through temporal partitioning. We argue that this temporal taskification step is not a neutral preprocessing choice, but a structural component of evaluation: different valid splits of the same stream can induce different CL regimes and therefore different benchmark conclusions. To study this effect, we introduce a taskification-level framework based on plasticity and stability profiles, a profile distance between taskifications, and Boundary-Profile Sensitivity (BPS), which diagnoses how strongly small boundary perturbations alter the induced regime before any CL model is trained. We evaluate continual finetuning, Experience Replay, Elastic Weight Consolidation, and Learning without Forgetting on network traffic forecasting with CESNET-Timeseries24, keeping the stream, model, and…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
