Dataset-Driven Channel Masks in Transformers for Multivariate Time Series

Seunghan Lee; Taeyoung Park; Kibok Lee

arXiv:2410.23222·cs.LG·May 7, 2026

TL;DR

This paper introduces dataset-specific channel masks in Transformer models to better capture channel dependencies in multivariate time series, improving modeling accuracy across diverse datasets.

Contribution

It proposes channel masks, built from dataset-specific information, that are multiplied element-wise into the attention matrices of Transformers to refine the channel dependencies they capture in multivariate time series.

Findings

01 Channel masks improve channel-dependency modeling in multivariate time series.

02 The approach enhances Transformer performance across multiple datasets.

03 The code is publicly available in the YonseiML/pcd GitHub repository.

Abstract

Capturing channel dependency (CD) is essential for modeling multivariate time series (TS), and attention-based methods have been widely employed for this purpose. Nonetheless, these methods primarily focus on modifying the architecture, often neglecting the importance of dataset-specific characteristics. In this work, we introduce the concept of partial channel dependence (PCD) to enhance CD modeling in Transformer-based models by leveraging dataset-specific information to refine the CD captured by the model. To achieve PCD, we propose channel masks (CMs), which are integrated into the attention matrices of Transformers via element-wise multiplication. CMs consist of two components: 1) a…
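
The element-wise integration described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`channel_attention`, `apply_channel_mask`), the toy channel embeddings, and the hand-written mask values are all hypothetical, and because the abstract is truncated before it names the two CM components, the mask here is simply assumed to be a given C x C matrix with entries in [0, 1]. Whether the masked rows are renormalized afterwards is also not stated in the abstract, so the sketch leaves them as they are.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def channel_attention(queries, keys):
    """Scaled dot-product attention weights between channels (C x C)."""
    dim = len(queries[0])
    scores = [
        [sum(q * k for q, k in zip(q_vec, k_vec)) / math.sqrt(dim) for k_vec in keys]
        for q_vec in queries
    ]
    return [softmax(row) for row in scores]


def apply_channel_mask(attn, mask):
    """Element-wise (Hadamard) product of an attention matrix and a channel mask."""
    return [
        [a * m for a, m in zip(attn_row, mask_row)]
        for attn_row, mask_row in zip(attn, mask)
    ]


# Toy input: 3 channels, each with a 2-dimensional embedding (hypothetical values).
tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
attn = channel_attention(tokens, tokens)

# Hypothetical dataset-specific mask: channels 0 and 1 are treated as related,
# channel 2 as unrelated to both. How the real mask is built is NOT shown here.
mask = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

masked = apply_channel_mask(attn, mask)
for row in masked:
    print([round(v, 3) for v in row])
```

A mask entry of 1 leaves the corresponding attention weight untouched, an entry of 0 removes the dependency between that pair of channels, and intermediate values scale it down, which is one way to read "partial" channel dependence.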

Code & Models

Repositories

YonseiML/pcd (GitHub)