A SSM is Polymerized from Multivariate Time Series
Haixiang Wu

TL;DR
This paper introduces Poly-Mamba, a novel multivariate time series forecasting method that explicitly models complex channel dependencies using an expanded orthogonal function basis, outperforming state-of-the-art methods on real datasets.
Contribution
The paper develops Poly-Mamba, a new approach that explicitly captures channel dependency variations in multivariate time series using an expanded orthogonal basis and novel approximation techniques.
Findings
Poly-Mamba outperforms SOTA methods on six real-world datasets.
It effectively models complex channel dependencies in multivariate time series.
The method is especially effective with datasets having many channels and complex correlations.
Abstract
For multivariate time series (MTS) tasks, previous state space models (SSMs) followed the modeling paradigm of Transformer-based methods. However, none of them explicitly model the complex dependencies of MTS: the Channel Dependency variations with Time (CDT). In view of this, we delve into the derivation of SSM, which involves approximating continuously updated functions by orthogonal function basis. We then develop Poly-Mamba, a novel method for MTS forecasting. Its core concept is to expand the original orthogonal function basis space into a multivariate orthogonal function space containing variable mixing terms, and make a projection on this space so as to explicitly describe the CDT by weighted coefficients. In Poly-Mamba, we propose the Multivariate Orthogonal Polynomial Approximation (MOPA) as a simplified implementation of this concept. For the simple linear relationship between…
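The idea of projecting onto an orthogonal basis that contains variable mixing terms can be illustrated with a small sketch. This is not the paper's MOPA implementation. It is a generic tensor-product Legendre projection on [-1, 1]², and the function name `project_2d`, the quadrature size, and the toy function are all assumptions made for illustration. It shows that a cross-channel interaction such as x·y appears as the weight on a mixed basis term P_1(x)P_1(y), which no purely univariate term can represent.

```python
import numpy as np
from numpy.polynomial import legendre as L


def project_2d(f, degree, n_quad=32):
    """Project f(x, y) on [-1, 1]^2 onto the basis P_i(x) * P_j(y).

    Hypothetical helper for illustration only; returns c with
    f(x, y) ~= sum_{i,j} c[i, j] * P_i(x) * P_j(y).
    """
    nodes, weights = L.leggauss(n_quad)        # Gauss-Legendre quadrature rule
    V = L.legvander(nodes, degree)             # V[k, i] = P_i(nodes[k])
    X, Y = np.meshgrid(nodes, nodes, indexing="ij")
    F = f(X, Y)                                # F[k, l] = f(x_k, y_l)
    W = np.outer(weights, weights)             # 2-D quadrature weights
    raw = V.T @ (W * F) @ V                    # inner products <f, P_i P_j>
    # Legendre polynomials satisfy integral of P_i^2 = 2 / (2i + 1), so rescale.
    norm = (2 * np.arange(degree + 1) + 1) / 2.0
    return raw * np.outer(norm, norm)


# Toy two-channel function: one univariate term plus one interaction term.
coeffs = project_2d(lambda x, y: 0.5 * x + 2.0 * x * y, degree=3)
univariate_weight = coeffs[1, 0]   # weight on P_1(x) alone
mixing_weight = coeffs[1, 1]       # weight on the mixed term P_1(x) * P_1(y)
```

In this toy setting the interaction strength is read directly off `mixing_weight`, which is the sense in which weighted coefficients on mixing terms can describe a dependency between channels. How the paper makes those weights vary with time is not covered by this sketch.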
Peer Reviews
Decision · ICLR 2025 Conference Withdrawn Submission
The proposed method is meaningful for extending SSM to multivariate time series modeling, especially in the context that most current models including the standard SSM only have univariate modeling capabilities and channel independence performs better.
- MOPA simplifies the multivariate space mapping process to reduce complexity, but lacks theoretical analysis and approximation errors, which may limit the effectiveness of the proposed method.
- The writing and charts are quite crude and need to be improved.
The problem of modeling high-dimensional, multivariate time series data with dynamic correlations is an important, interesting, and long-standing problem with many, many prior works.
Unfortunately, due to several issues outlined below, this paper is not acceptable for publication in its current form. Major issues:
- Most critically, while there may be a preponderance of ‘better performance’ of their models compared to other models, the differences are so small as to be practically meaningless. Take, for example, the comparisons made in Table 1 (main results) on the Weather data, which the authors themselves highlight in the results as an example of particular
The motivation behind the proposed method is meaningful, as the standard SSM is limited to the projection and reconstruction of univariate functions and lacks the ability to model relationships between multivariate time series. Extending the SSM to a multivariate orthogonal polynomial projection provides an **elegant solution**.
* **W1**: Firstly, the writing of this article needs improvement. The overall impression of this article is rather mediocre, with numerous lines consisting of only one or two words and substantial white spaces. The model diagram and experimental figures are also quite rudimentary. Furthermore, I recommend relocating large tables such as Table 1 & 2 to the appendix, including only summarized results in the main body of the text.
* **W2**: Attention should be paid to certain details. For instance
1. This paper presents a well-motivated and effective approach for capturing inter-feature dependencies in state-space models, addressing a gap where many existing models fail to explicitly model these relationships.
1. Although the proposed method is grounded in certain theoretical results, I find it challenging for the method to effectively model inter-feature relationships. In MOPA, a weight matrix is element-wise multiplied by coefficients, meaning it cannot explicitly capture inter-feature dependencies. While LCM does capture these relationships, its architectural novelty seems rather limited.
2. Could you explain how MOPA approximates Equation 8? The formulations appear quite different, so a detailed e
Taxonomy
Topics: Time Series Analysis and Forecasting · Anomaly Detection Techniques and Applications
Methods: Matching The Statements
