Sketches for Time-Dependent Machine Learning
Jesus Antonanzas, Marta Arias, Albert Bifet

TL;DR
This paper introduces a method to improve time series machine learning models by maintaining evolving statistical summaries of the data, chiefly the mean and variance at different time resolutions, and using them to improve prediction accuracy without significant computational cost.
Contribution
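It presents an approach for incorporating time-dependent data summaries into models, either as additional input features or over learned latent representations such as those of Recurrent Neural Networks, so that models adapt better to changing data distributions.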
Findings
In classification tasks, the proposed techniques can significantly outperform equivalent architectures that do not use the summaries.
Efficient maintenance of statistical summaries with minimal overhead.
Enhanced model robustness to data distribution shifts.
Abstract
Time series data can be subject to changes in the underlying process that generates them and, because of these changes, models built on old samples can become obsolete or perform poorly. In this work, we present a way to incorporate information about the current data distribution and its evolution across time into machine learning algorithms. Our solution is based on efficiently maintaining statistics, particularly the mean and the variance, of data features at different time resolutions. These data summarisations can be performed over the input attributes, in which case they can then be fed into the model as additional input features, or over latent representations learned by models, such as those of Recurrent Neural Networks. In classification tasks, the proposed techniques can significantly outperform the prediction capabilities of equivalent architectures with no feature / latent…
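To make the idea of maintaining the mean and variance of features at different time resolutions concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: it assumes exponentially decayed statistics, with one decay factor per resolution, and the class name `MultiResolutionStats`, the `alphas` parameter, and the `augment` method are hypothetical names introduced only for this example. The paper's actual sketching scheme may differ.

```python
class MultiResolutionStats:
    """Running mean and variance per feature at several time resolutions.

    Assumption: each resolution is an exponential decay factor alpha.
    A larger alpha forgets faster (fine resolution), a smaller alpha
    forgets slower (coarse resolution).
    """

    def __init__(self, n_features, alphas=(0.5, 0.1, 0.01)):
        self.n_features = n_features
        self.alphas = list(alphas)
        # One row of statistics per resolution.
        self.mean = [[0.0] * n_features for _ in self.alphas]
        self.var = [[0.0] * n_features for _ in self.alphas]
        self.seen_first = False

    def update(self, x):
        """Fold one sample into every summary in O(resolutions * features)."""
        if len(x) != self.n_features:
            raise ValueError("sample has the wrong number of features")
        if not self.seen_first:
            # Initialise every mean with the first sample; variances stay 0.
            for mean in self.mean:
                mean[:] = [float(v) for v in x]
            self.seen_first = True
            return
        for alpha, mean, var in zip(self.alphas, self.mean, self.var):
            for j, value in enumerate(x):
                diff = value - mean[j]
                incr = alpha * diff
                mean[j] += incr
                # Standard incremental update for exponentially weighted variance.
                var[j] = (1.0 - alpha) * (var[j] + diff * incr)

    def augment(self, x):
        """Update the summaries, then return x followed by all means and variances."""
        self.update(x)
        row = [float(v) for v in x]
        for mean in self.mean:
            row.extend(mean)
        for var in self.var:
            row.extend(var)
        return row
```

A downstream model would then be trained on `stats.augment(x)` instead of `x`, so each input row carries `n_features * (1 + 2 * len(alphas))` values. Each update touches a fixed amount of state per sample, which is the sense in which such summaries add little overhead.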
Taxonomy
Topics: Time Series Analysis and Forecasting · Data Stream Mining Techniques · Anomaly Detection Techniques and Applications
