Parametric Augmentation for Time Series Contrastive Learning
Xu Zheng, Tianchun Wang, Wei Cheng, Aitian Ma, Haifeng Chen, Mo Sha, Dongsheng Luo

TL;DR
This paper introduces AutoTCL, a parametric augmentation framework for time series contrastive learning that adaptively enhances representation learning, leading to improved forecasting and classification performance.
Contribution
It proposes a novel adaptive augmentation method for time series contrastive learning, addressing the challenge of selecting meaningful augmentations without visual inspection.
Findings
Achieves 6.5% reduction in MSE on forecasting tasks
Attains 4.7% reduction in MAE on forecasting tasks
Increases average classification accuracy by 1.2%
Abstract
Modern techniques like contrastive learning have been effectively used in many areas, including computer vision, natural language processing, and graph-structured data. Creating positive examples that assist the model in learning robust and discriminative representations is a crucial stage in contrastive learning approaches. Usually, preset human intuition directs the selection of relevant data augmentations. Due to patterns that are easily recognized by humans, this rule of thumb works well in the vision and language domains. However, it is impractical to visually inspect the temporal structures in time series. The diversity of time series augmentations at both the dataset and instance levels makes it difficult to choose meaningful augmentations on the fly. In this study, we address this gap by analyzing time series data augmentation using information theory and summarizing the most…
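To make the role of positive examples concrete, here is a minimal sketch of an InfoNCE-style contrastive loss in NumPy. It is a generic illustration, not the loss used by AutoTCL: the function name `info_nce`, the temperature value, and the random embeddings are all assumptions made for this example. Each anchor embedding is pulled toward the embedding of its own augmented view and pushed away from the views of the other samples in the batch.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """Generic InfoNCE loss (illustrative; not the paper's exact objective).

    anchors, positives: arrays of shape (N, D); row i of `positives` is the
    augmented view of row i of `anchors`. Other rows act as negatives.
    """
    # L2-normalize so that dot products are cosine similarities.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                      # shape (N, N)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The matching pair sits on the diagonal.
    return float(-np.mean(np.diag(log_prob)))

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))  # hypothetical encoder outputs

# A view that stays close to the anchor versus an unrelated "view".
loss_aligned = info_nce(z, z + 0.01 * rng.normal(size=z.shape))
loss_random = info_nce(z, rng.normal(size=z.shape))
print(loss_aligned, loss_random)
```

In this toy setting the loss is lower when the positive view preserves the anchor's information, which is why the choice of augmentation matters so much for contrastive methods.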
Peer Reviews
Decision: ICLR 2024 poster
The paper has the following strengths:
- It is clear, and the writing is good.
- The empirical analysis seems sound. From Table 4, there are benefits to the approach compared to relevant baselines and ablated versions of the model.
- The idea is proposed in a principled, justified way.
The paper's weaknesses are:
- Tables 6 and 7, along with the tables from the main paper, seem to showcase marginal improvements over CoST, which was the architectural basis of their approach. In particular, it is hard to determine if the difference is statistically significant.
- Most of the comparison in Tables 6 and 7 is less relevant than the ablation study. The reason is the following: the authors are comparing different architectures. Most of these perform less well than CoST, and they are buildin
- The proposed approach effectively deals with the data augmentation problem by unifying various methods into a comprehensive framework through the utilization of information theory.
- Not only has the effectiveness of the framework been theoretically proven from an information theory perspective, but it has also been extensively validated through empirical experiments.
- There may be some minor inconsistencies or gaps in the proofs that require further attention to ensure rigor. For example, in the proof of Property 1, an invertible mapping is not necessarily a one-to-one mapping; this depends on the domain. Of course, this does not affect the subsequent proof.
- Some formatting errors: on page 6, "as random timestamp masking" is set in bold, which seems unnecessary.
- The font size of some tables is too small.
1. The study is well grounded in theory concerning its motivation and summarizes the theoretical conditions that the "GOOD VIEWS" of contrastive learning should meet.
2. The proposed method exhibits outstanding performance in experimental results.
The paper emphasizes the use of a parametric module to decompose time series into an informative part and a task-irrelevant part, and to perform a parametric transformation on the informative part to obtain an enhanced view. However, this is not sufficiently clear.
1. What role does g play specifically? In the analysis of the ablation experiments, I only observed differences in the results.
2. It is ambiguous why h is able to focus on the informative part of the sequence. The author should
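The decomposition described in the last review can be pictured with a toy sketch. The code below is only an illustration under loud assumptions: in the paper, h and g are parametric modules, whereas here `factorization_mask` and `transform` are hypothetical hand-written stand-ins (a magnitude-based mask and a fixed rescaling), and replacing the masked-out timestamps with noise is an assumption of this example rather than a statement about the authors' method.

```python
import numpy as np

def factorization_mask(x, keep_ratio=0.5):
    """Hypothetical stand-in for h: a binary mask over timestamps.

    Keeps the timestamps that deviate most from the series mean. A learned
    module would produce this mask instead of a fixed heuristic.
    """
    score = np.abs(x - x.mean())
    k = max(1, int(round(keep_ratio * x.size)))
    threshold = np.sort(score)[-k]          # k-th largest score
    return (score >= threshold).astype(x.dtype)

def transform(x_informative, scale=1.2):
    """Hypothetical stand-in for g: rescale the informative part."""
    return scale * x_informative

def augment(x, keep_ratio=0.5, scale=1.2, noise_std=0.1, rng=None):
    """Build an augmented view from the informative part of x."""
    rng = np.random.default_rng() if rng is None else rng
    mask = factorization_mask(x, keep_ratio)
    informative = mask * x                   # part selected by the mask
    # Assumption: the remaining timestamps are treated as task-irrelevant
    # and filled with small Gaussian noise.
    noise = (1.0 - mask) * rng.normal(0.0, noise_std, size=x.shape)
    return transform(informative, scale) + noise, mask

rng = np.random.default_rng(0)
x = rng.normal(size=64)                      # synthetic univariate series
view, mask = augment(x, rng=rng)
print(int(mask.sum()), view.shape)
```

Even in this toy form, the two reviewer questions are visible: the quality of the view depends on whether the mask really selects informative timestamps, and on what the transformation adds beyond masking alone.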
Taxonomy
Topics: Neural Networks and Applications · Time Series Analysis and Forecasting · Face and Expression Recognition
Methods: Contrastive Learning · Masked autoencoder
