Enhancing Masked Time-Series Modeling via Dropping Patches

Tianyu Qiu; Yi Xie; Yun Xiong; Hao Niu; Xiaofeng Gao

arXiv:2412.15315 · stat.ML · December 23, 2024

TL;DR

This paper introduces DropPatch, a method that enhances masked time-series modeling by randomly dropping sub-sequence-level patches. It improves pre-training efficiency and downstream performance in in-domain, cross-domain, few-shot, and cold-start scenarios, and the paper supports these claims with both empirical and theoretical analysis.

Contribution

The paper proposes DropPatch, a simple patch-dropping technique for masked time-series modeling that improves pre-training efficiency and downstream performance.

Findings

01 DropPatch improves pre-training efficiency by a square-level advantage.

02 It improves model performance in in-domain, cross-domain, few-shot, and cold-start scenarios.

03 Theoretically, it slows the rate at which Transformer representations collapse into a rank-1 linear subspace.

Abstract

This paper explores how to enhance existing masked time-series modeling by randomly dropping sub-sequence level patches of time series. On this basis, a simple yet effective method named DropPatch is proposed, which has two remarkable advantages: 1) It improves the pre-training efficiency by a square-level advantage; 2) It provides additional advantages for modeling in scenarios such as in-domain, cross-domain, few-shot learning and cold start. This paper conducts comprehensive experiments to verify the effectiveness of the method and analyze its internal mechanism. Empirically, DropPatch strengthens the attention mechanism, reduces information redundancy and serves as an efficient means of data augmentation. Theoretically, it is proved that DropPatch slows down the rate at which the Transformer representations collapse into the rank-1 linear subspace by randomly dropping patches, thus…
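The abstract describes the mechanism only in words, so a minimal sketch may help make it concrete. The code below is an illustration under stated assumptions, not the authors' implementation (see the official repository for that): it assumes non-overlapping patches, assumes patches are dropped before the remaining ones are masked, and uses hypothetical names (`patchify`, `drop_then_mask`, `attention_cost_ratio`) and hypothetical ratios. The cost function reflects one reading of the "square-level" efficiency claim, namely that self-attention cost grows with the square of the number of input patches.

```python
import random


def patchify(series, patch_len):
    """Split a 1-D series into non-overlapping patches; any tail remainder is discarded."""
    n_patches = len(series) // patch_len
    return [series[i * patch_len:(i + 1) * patch_len] for i in range(n_patches)]


def drop_then_mask(patches, drop_ratio, mask_ratio, rng):
    """Randomly drop patches, then randomly mask a share of the survivors.

    Assumption: dropped patches are removed from the encoder input entirely,
    while masked patches stay in the sequence as reconstruction targets.
    Returns index lists (kept, masked, visible) into `patches`.
    """
    n = len(patches)
    n_keep = max(1, n - int(n * drop_ratio))       # always keep at least one patch
    kept = sorted(rng.sample(range(n), n_keep))    # sorted to preserve temporal order
    n_mask = int(len(kept) * mask_ratio)
    masked = sorted(rng.sample(kept, n_mask))
    masked_set = set(masked)
    visible = [i for i in kept if i not in masked_set]
    return kept, masked, visible


def attention_cost_ratio(drop_ratio):
    """Relative self-attention cost after dropping, assuming cost scales with (number of patches)^2."""
    return (1.0 - drop_ratio) ** 2


if __name__ == "__main__":
    rng = random.Random(0)                         # fixed seed for repeatability
    patches = patchify(list(range(96)), patch_len=8)
    kept, masked, visible = drop_then_mask(patches, drop_ratio=0.5, mask_ratio=0.5, rng=rng)
    print("kept:", kept)
    print("masked:", masked)
    print("visible:", visible)
    print("relative attention cost:", attention_cost_ratio(0.5))
```

Under these assumptions, the encoder would see only the kept patches, so the sequence length it attends over shrinks by the drop ratio while the masking objective on the surviving patches is unchanged.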

Code & Models

Repositories

qityy/droppatch (PyTorch, official)

Taxonomy

Topics: Data Visualization and Analytics · Time Series Analysis and Forecasting

Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Absolute Position Encodings · Dense Connections · Multi-Head Attention · Position-Wise Feed-Forward Layer · Label Smoothing · Residual Connection · Adam