CoRA: Boosting Time Series Foundation Models for Multivariate Forecasting through Correlation-aware Adapter
Hanyin Cheng, Xingjian Wu, Yang Shu, Zhongwen Rao, Lujia Pan, Bin Yang, Chenjuan Guo

TL;DR
This paper introduces CoRA, a lightweight, correlation-aware adapter for Time Series Foundation Models that captures dynamic and static inter-channel correlations to significantly enhance multivariate forecasting accuracy.
Contribution
The paper presents CoRA, a novel, plug-and-play adapter that decomposes correlation matrices and employs dual contrastive learning to improve multivariate time series forecasting.
Findings
CoRA improves forecasting accuracy across 10 real-world datasets.
Decomposition of correlation matrices reduces model complexity.
Dual contrastive learning effectively captures positive and negative correlations.
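To make the complexity finding concrete, here is a minimal NumPy sketch of a low-rank correlation-like matrix built from a time-invariant factor plus a polynomial-in-time factor. It is an illustration under stated assumptions, not the paper's formulation. The names `low_rank_correlation` and `parameter_counts`, the symmetric `U @ U.T` form, and the use of normalized time `t` are all hypothetical choices made here.

```python
import numpy as np

def low_rank_correlation(U_static, poly_coeffs, t):
    """Return an N x N correlation-like matrix at normalized time t.

    U_static:    (N, r) time-invariant factor (assumed form).
    poly_coeffs: (d + 1, N, r) coefficients of a degree-d polynomial in t
                 that yields the time-varying factor (assumed form).
    """
    # Powers of t: [1, t, t^2, ..., t^d]
    powers = t ** np.arange(poly_coeffs.shape[0])
    # Evaluate the polynomial entrywise to get the (N, r) time-varying factor.
    U_dynamic = np.tensordot(powers, poly_coeffs, axes=1)
    # Sum of two rank-r symmetric terms, so the result has rank at most 2r.
    return U_static @ U_static.T + U_dynamic @ U_dynamic.T

def parameter_counts(n_channels, rank, degree):
    """Compare a dense N x N matrix with the factorized parameterization."""
    full = n_channels * n_channels
    # One static factor plus (degree + 1) coefficient tensors, each N x r.
    low_rank = n_channels * rank * (degree + 2)
    return full, low_rank

rng = np.random.default_rng(0)
N, r, d = 100, 2, 3
U_static = rng.normal(size=(N, r))
poly_coeffs = rng.normal(size=(d + 1, N, r))
C = low_rank_correlation(U_static, poly_coeffs, t=0.5)
full, low_rank = parameter_counts(N, r, d)
```

With these illustrative sizes, a single dense matrix needs `N * N` entries, while the factorized form needs `N * r * (d + 2)` and covers every time step through the polynomial. The saving only appears when `r * (d + 2)` is smaller than `N`.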
Abstract
Most existing Time Series Foundation Models (TSFMs) use channel-independent modeling and focus on capturing and generalizing temporal dependencies, while neglecting the correlations among channels or overlooking the different aspects of correlations. However, these correlations play a vital role in multivariate time series forecasting. To address this, we propose a CoRrelation-aware Adapter (CoRA), a lightweight plug-and-play method that requires only fine-tuning with TSFMs and is able to capture different types of correlations, so as to improve forecast performance. Specifically, to reduce complexity, we innovatively decompose the correlation matrix into low-rank Time-Varying and Time-Invariant components. For the Time-Varying component, we further design learnable polynomials to learn dynamic correlations by capturing trends or periodic patterns. To learn positive and negative…
Peer Reviews
Decision: ICLR 2026 Poster
This paper proposes CoRA, a novel correlation-aware adapter designed to enhance the multivariate forecasting capabilities of Time Series Foundation Models (TSFMs) on specific downstream tasks.
1. Originality: The paper is the first to propose a unified framework that simultaneously addresses the dynamic, heterogeneous, and partial aspects of inter-channel correlations.
2. Quality: The paper is of high quality, featuring a rigorous methodology, clear theoretical derivations, and practical efficie
This paper could be improved in the following areas: 1. Lack of Direct Experimental Validation for Partial Correlation (PCorr): The paper claims to model three types of correlations: DCorr, HCorr, and PCorr. While experiments provide strong evidence for DCorr (e.g., Fig. 7) and HCorr (separation of positive/negative spaces), the validation for PCorr is less direct. The ablation study (Table 2) shows the overall benefit of the HPCL module but does not disentangle the specific contributions of mod
- I appreciate that the paper is trying to separately handle different types of correlation (DCorr, HCorr, PCorr)
- I find the idea of using a time-varying and time-invariant decomposition to be valuable, along with the idea of finding pairs of features with strong enough positive or negative correlation
- The experimental results appear to be very impressive (although as pointed out in weaknesses, I would like to see more baselines and some additional experiments)
- At least in how it is stated now, it's hard for me to see why Theorem 1 should hold since equation (15) in Theorem 1 disagrees with equation (5) and the decomposition shown earlier in Figure 3.
- In stating theoretical guarantees, I would suggest also providing intuition for why the guarantee should hold (this intuition should be in the main paper and not in an appendix/supplemental material as it helps the reader understand/interpret the guarantee) and whether the proof uses any nontrivial i
1. The paper is well-motivated, addressing the critical challenge of modeling complex inter-channel dependencies in multivariate time series.
2. The manuscript is well-written, featuring clear and consistent notation throughout.
3. The empirical evaluation demonstrates that CoRA consistently surpasses state-of-the-art baselines across a diverse set of real-world datasets.
1. The projection layers differ little from related modeling approaches in existing methods (such as TSMixer [1]); a more detailed clarification is needed.
2. The set of baselines for comparison is not comprehensive, as it omits recent state-of-the-art foundation models like TimerXL [2].
3. Discussion of and comparison with classical baselines (e.g., PatchTST [3], Leddam [4], iTransformer [5], Autoformer [6], DLinear [7]) is suggested.
*[1] TSMixer: An All-MLP Architecture for Time Series
Taxonomy
Topics: Time Series Analysis and Forecasting · Forecasting Techniques and Applications · Traffic Prediction and Management Techniques
