Parametric Augmentation for Time Series Contrastive Learning

Xu Zheng; Tianchun Wang; Wei Cheng; Aitian Ma; Haifeng Chen; Mo Sha; Dongsheng Luo

arXiv:2402.10434·cs.LG·February 19, 2024·1 cites

TL;DR

This paper introduces AutoTCL, a parametric augmentation framework for time series contrastive learning that adaptively enhances representation learning, leading to improved forecasting and classification performance.

Contribution

It proposes a novel adaptive augmentation method for time series contrastive learning, addressing the challenge of selecting meaningful augmentations without visual inspection.

Findings

01

Achieves 6.5% reduction in MSE on forecasting tasks

02

Attains 4.7% reduction in MAE on forecasting tasks

03

Increases average classification accuracy by 1.2%

Abstract

Modern techniques like contrastive learning have been effectively used in many areas, including computer vision, natural language processing, and graph-structured data. Creating positive examples that assist the model in learning robust and discriminative representations is a crucial stage in contrastive learning approaches. Usually, preset human intuition directs the selection of relevant data augmentations. Due to patterns that are easily recognized by humans, this rule of thumb works well in the vision and language domains. However, it is impractical to visually inspect the temporal structures in time series. The diversity of time series augmentations at both the dataset and instance levels makes it difficult to choose meaningful augmentations on the fly. In this study, we address this gap by analyzing time series data augmentation using information theory and summarizing the most…
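The abstract's point that positives are usually produced by a preset, hand-picked augmentation can be illustrated with a minimal sketch. This is not AutoTCL's implementation: the names `jitter` and `info_nce` are hypothetical, the fixed Gaussian jitter stands in for whatever augmentation a practitioner might choose, and the raw series values are used as embeddings in place of a learned encoder. It only shows how an InfoNCE-style loss rewards an anchor for being closer to its own augmented view than to views of other series.

```python
import math
import random


def jitter(series, sigma=0.05, rng=None):
    """Hand-picked augmentation (assumption): add Gaussian noise to each timestamp."""
    rng = rng or random.Random(0)
    return [x + rng.gauss(0.0, sigma) for x in series]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    positives[i] is the positive view for anchors[i]; every other
    positives[j] in the batch serves as a negative.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        top = max(logits)
        log_denom = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


# Toy batch: three series with distinct shapes, used directly as embeddings.
batch = [
    [math.sin(0.3 * t) for t in range(32)],
    [math.cos(0.7 * t) for t in range(32)],
    [0.1 * t - 1.5 for t in range(32)],
]
rng = random.Random(0)
views = [jitter(s, rng=rng) for s in batch]

# Correctly paired views versus views deliberately paired with the wrong series.
loss_matched = info_nce(batch, views)
loss_shuffled = info_nce(batch, views[1:] + views[:1])
```

With the views paired correctly the loss is small, and it grows when each anchor is paired with another series' view. A parametric approach, as the abstract describes it, would learn the augmentation rather than fix it by hand as `jitter` does here.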

Peer Reviews

Decision·ICLR 2024 poster

Reviewer 01 · Rating 8 · accept, good paper · Confidence 4

Strengths

The paper has the following strengths:
- It is clear, and the writing is good.
- The empirical analysis seems sound. From Table 4, there are benefits to the approach compared to relevant baselines and ablated versions of the model.
- The idea is proposed in a principled, justified way.

Weaknesses

The paper's weaknesses are:
- Tables 6 and 7, along with the tables in the main paper, seem to show only marginal improvements over CoST, which was the architectural basis of the approach. In particular, it is hard to determine whether the difference is statistically significant.
- Most of the comparisons in Tables 6 and 7 are less relevant than the ablation study, for the following reason: the authors are comparing different architectures. Most of these perform less well than CoST, and they are buildin

Reviewer 02 · Rating 8 · accept, good paper · Confidence 4

Strengths

- The proposed approach effectively deals with the data augmentation problem by unifying various methods into a comprehensive framework through the utilization of information theory.
- Not only has the effectiveness of the framework been theoretically proven from an information theory perspective, but it has also been extensively validated through empirical experiments.

Weaknesses

- There might be some minor inconsistencies or gaps in the proofs that require further attention to ensure rigor. For example, in the proof of Property 1, an invertible mapping is not necessarily a one-to-one mapping, depending on the domain. Of course, this does not affect the subsequent proof.
- Some formatting errors: on page 6, "as random timestamp masking" seems to be bolded unnecessarily.
- The font size of some tables is too small.

Reviewer 03 · Rating 6 · marginally above the acceptance threshold · Confidence 4

Strengths

1. The study is well-grounded in theory concerning its motivation and summarizes the theoretical conditions that the "GOOD VIEWS" of contrastive learning should meet.
2. The proposed method exhibits outstanding performance in experimental results.

Weaknesses

The paper emphasizes the use of a parametric module to decompose a time series into an informative part and a task-irrelevant part, and to perform a parametric transformation on the informative part to obtain an enhanced view. However, this is not sufficiently clear.
1. What role does g play specifically? In the analysis of the ablation experiments in particular, I only observed differences in the results;
2. It is ambiguous why h is able to focus on the informative part of the sequence. The author should

Code & Models

Repositories

AslanDing/AutoTCL
PyTorch · Official

Videos

Parametric Augmentation for Time Series Contrastive Learning · SlidesLive

Taxonomy

Topics: Neural Networks and Applications · Time Series Analysis and Forecasting · Face and Expression Recognition

Methods: Contrastive Learning · Masked autoencoder