Quadratic Direct Forecast for Training Multi-Step Time-Series Forecast Models
Hao Wang, Licheng Pan, Yuan Lu, Zhichao Chen, Tianqiao Liu, Shuting He, Zhixuan Chu, Qingsong Wen, Haoxuan Li, Zhouchen Lin

TL;DR
This paper introduces a quadratic-form weighted training objective for multi-step time-series forecasting, which accounts for label autocorrelation and heterogeneous task importance, leading to improved forecasting accuracy.
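For orientation, a quadratic-form weighted objective can be written as below. The notation is an assumption made for illustration and is not taken from the paper: $y, \hat{y} \in \mathbb{R}^{T}$ are the label and forecast over a horizon of $T$ steps, $e = y - \hat{y}$ is the error vector, and $W \in \mathbb{R}^{T \times T}$ is a symmetric weighting matrix.

$$
\mathcal{L}_{W}(\hat{y}, y) = e^{\top} W e = \sum_{t=1}^{T} W_{tt}\, e_t^{2} + \sum_{t \neq s} W_{ts}\, e_t e_s
$$

The first sum carries the per-step task weights and the second sum couples errors at different steps. Setting $W = \tfrac{1}{T} I$ removes the coupling and makes all weights equal, which recovers mean squared error.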
Contribution
The paper proposes a novel quadratic-form weighted training objective and a Quadratic Direct Forecast (QDF) algorithm that adaptively updates the weighting matrix for better multi-step forecasting.
Findings
QDF improves forecasting accuracy across various models.
The quadratic weighting captures label autocorrelation effectively.
QDF achieves state-of-the-art results on benchmark datasets.
Abstract
The design of the training objective is central to training time-series forecasting models. Existing training objectives such as mean squared error mostly treat each future step as an independent, equally weighted task, which we find leads to two issues: (1) they overlook the label autocorrelation effect among future steps, leading to a biased training objective; (2) they fail to set heterogeneous task weights for the forecasting tasks corresponding to different future steps, limiting forecasting performance. To fill this gap, we propose a novel quadratic-form weighted training objective that addresses both issues simultaneously. Specifically, the off-diagonal elements of the weighting matrix account for the label autocorrelation effect, whereas the non-uniform diagonal elements are expected to match the most preferable weights of the forecasting tasks with varying future steps. To…
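The roles of the diagonal and off-diagonal entries can be checked numerically with a minimal sketch. Everything here is illustrative: `quadratic_loss` is a hypothetical helper, and the matrix `weight` is hand-picked rather than learned, so this is not the paper's QDF algorithm, only the loss form it builds on.

```python
def quadratic_loss(y_true, y_pred, weight):
    """Return e^T W e for one forecast window, where e = y_true - y_pred."""
    err = [t - p for t, p in zip(y_true, y_pred)]
    n = len(err)
    return sum(weight[i][j] * err[i] * err[j] for i in range(n) for j in range(n))


y_true = [1.0, 2.0, 3.0]
y_pred = [0.5, 2.5, 2.0]
horizon = len(y_true)

# With W = I / horizon the quadratic form reduces to mean squared error.
identity = [[1.0 / horizon if i == j else 0.0 for j in range(horizon)]
            for i in range(horizon)]
mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / horizon
identity_loss = quadratic_loss(y_true, y_pred, identity)

# Hypothetical weighting matrix (an assumption, not a learned one):
# a non-uniform diagonal sets per-step weights, and symmetric
# off-diagonal entries couple the errors of neighbouring steps.
weight = [
    [1.0, 0.2, 0.0],
    [0.2, 0.8, 0.2],
    [0.0, 0.2, 0.6],
]
weighted_loss = quadratic_loss(y_true, y_pred, weight)

# Same matrix with the off-diagonal entries zeroed, for comparison.
diagonal_only = [[weight[i][j] if i == j else 0.0 for j in range(horizon)]
                 for i in range(horizon)]
diagonal_loss = quadratic_loss(y_true, y_pred, diagonal_only)
```

In this toy case the errors at adjacent steps have opposite signs, so the positive off-diagonal entries lower the loss relative to the diagonal-only version. With same-signed errors they would raise it instead.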
Peer Reviews
Decision: ICLR 2026 Poster
1. Overall this paper is clearly written and easy to understand. The technical details are sound and mostly sufficient.
2. The proposed method to improve the training objectives for time series forecasting is novel and applicable to related problems in this domain.
3. The authors provide the source code of their implementation. After reviewing the source code, I did not find major issues.
4. The authors perform rigorous evaluations and helpful ablation studies to understand the impact of d
1. In addition to MAE and MSE, the authors should evaluate their proposed method with MAPE (mean absolute percentage error), which is robust under different scales of the time series values.
2. The authors should also evaluate their proposed method on standard benchmark datasets for time series forecasting, such as the M4 competition dataset.
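The scale argument behind the MAPE request can be illustrated with a minimal sketch. The helpers `mape` and `mae` and the toy series below are assumptions made for this example and do not come from the paper or its code. Note that MAPE is undefined whenever a true value is zero, so the sketch rejects that case.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; true values must be nonzero."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when a true value is zero")
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error, in the units of the series."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Two toy series that differ only by a factor of 100 in scale.
small_true, small_pred = [10.0, 20.0], [11.0, 18.0]
large_true, large_pred = [1000.0, 2000.0], [1100.0, 1800.0]

mape_small = mape(small_true, small_pred)
mape_large = mape(large_true, large_pred)
mae_small = mae(small_true, small_pred)
mae_large = mae(large_true, large_pred)
```

Rescaling the series multiplies MAE by the same factor but leaves MAPE unchanged, which is the property the reviewer points to.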
- The paper introduces a technically novel strategy to solve an empirically and theoretically motivated problem that is highly relevant in neural (direct) non-autocorrelational models for time series forecasting.
- Promising empirical evidence of QDF on a comprehensive set of datasets, against transformer and non-transformer based methods, and against a variety of time series optimization objectives.
- Ablation studies and the effect on various architectures are provided.
- A theoretical result or comments on the convergence of Algorithms 1 and 2 would make the presentation stronger.
- Emphasizing the above point, the proposed method may be too computationally expensive for some models, given the bilevel optimization over $K$ splits required to find $\Sigma$.
- Some parts of the text are not very clear: For instance, on the results in Section 4.2, which model is used to compare the different forecasting objectives? Similarly in Table
1. The paper improves the learning objective for TSF models, which is an insufficiently explored yet important research problem in time-series forecasting.
2. Experimental results are effectively presented and structured to support the paper's claims.
3. Code is provided, which facilitates reproduction.
1. The paper could benefit from a more analytical exploration of why the proposed weighting matrix could improve on existing formulations, especially FreDF and Time-o1.
2. For time-series datasets, careful design of data splitting strategies is essential to avoid information leakage. It is not very clear whether the risk of leakage is fully bypassed in the hybrid splitting strategy of this paper (e.g., validation, meta-update, etc.). A detailed analysis is needed to analyze the leakage issu
Taxonomy
Topics: Forecasting Techniques and Applications · Stock Market Forecasting Methods · Traffic Prediction and Management Techniques
