TL;DR
This paper quantifies the benefit of self-supervised learning (SSL) for time series by comparing generative and latent approaches. SSL delivers large gains for anomaly detection and classification but only marginal gains for forecasting, a difference that depends on each task's signal resolution requirements.
Contribution
The paper introduces a controlled framework for quantifying the "pre-training dividend" of time series SSL, and compares Generative paradigms against Latent Alignment architectures, including adaptations of LeJEPA and DINO for time series.
Findings
SSL yields gains of up to 375% for anomaly detection and classification, but only marginal gains for forecasting.
Representation utility depends on task-specific signal resolution requirements.
Representation quality saturates at moderate model depths, independent of data source.
Abstract
The success of self-supervised learning (SSL) in vision and NLP has motivated its rapid adoption for time series. However, research has focused primarily on Generative paradigms and forecasting tasks, leaving the broader utility of learned representations unquantified. We establish a controlled framework to evaluate the "pre-training dividend": the value added by SSL across diverse temporal tasks. We systematically compare Generative paradigms against Latent Alignment architectures, introducing adaptations of LeJEPA and DINO for time series. These adaptations utilize Discrete Wavelet Transform (DWT) augmentations to enforce invariance to local fluctuations. Our analysis reveals that the pre-training dividend is highly asymmetric: SSL yields gains of up to 375% for anomaly detection and classification, yet remains marginal for forecasting. We demonstrate that representational utility is…
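As an illustration of the DWT augmentation idea mentioned in the abstract, here is a minimal sketch, not the authors' implementation. It assumes a single-level Haar wavelet, a 1-D series of even length, and a hypothetical `dwt_augment` helper that randomly rescales the detail (high-frequency) coefficients. Two views produced this way share the same coarse structure and differ only in local fluctuations, which is the kind of variation an alignment objective could be trained to ignore. The wavelet family, decomposition depth, and perturbation scheme used in the paper are not stated in this summary.

```python
import numpy as np

def haar_dwt(x):
    """Single-level Haar DWT of a 1-D array with even length."""
    x = np.asarray(x, dtype=float)
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)   # coarse, low-frequency part
    detail = (even - odd) / np.sqrt(2.0)   # local, high-frequency part
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed samples."""
    out = np.empty(2 * approx.size)
    out[0::2] = (approx + detail) / np.sqrt(2.0)
    out[1::2] = (approx - detail) / np.sqrt(2.0)
    return out

def dwt_augment(x, rng, detail_scale_range=(0.0, 1.0)):
    """Hypothetical augmentation: keep the approximation coefficients,
    rescale the detail coefficients by a random factor."""
    approx, detail = haar_dwt(x)
    scale = rng.uniform(*detail_scale_range)
    return haar_idwt(approx, scale * detail)

# Two views of the same noisy series for an alignment-style objective.
rng = np.random.default_rng(0)
series = np.sin(np.linspace(0.0, 2.0 * np.pi, 64)) + 0.1 * rng.standard_normal(64)
view_a = dwt_augment(series, rng)
view_b = dwt_augment(series, rng)
```

Because only the detail coefficients are perturbed, both views have identical approximation coefficients. A multi-level decomposition or a different wavelet would change which scales count as "local".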
