Multi-Modal Time Series Prediction via Mixture of Modulated Experts
Lige Zhang, Ali Maatouk, Jialin Chen, Leandros Tassiulas, Rex Ying

TL;DR
This paper introduces Expert Modulation, a novel multi-modal time series prediction approach that uses textual signals to condition expert routing and computation, leading to improved forecasting accuracy.
Contribution
The paper proposes a new paradigm for multi-modal time series forecasting that conditions expert behavior on textual signals, addressing limitations of token-level fusion methods.
Findings
Significant performance improvements over existing methods.
Effective cross-modal control over expert behavior.
Theoretical analysis supports the proposed approach.
Abstract
Real-world time series exhibit complex and evolving dynamics, making accurate forecasting extremely challenging. Recent multi-modal forecasting methods leverage textual information such as news reports to improve prediction, but most rely on token-level fusion that mixes temporal patches with language tokens in a shared embedding space. However, such fusion can be ill-suited when high-quality time-text pairs are scarce and when time series exhibit substantial variation in scale and characteristics, thus complicating cross-modal alignment. In parallel, Mixture-of-Experts (MoE) architectures have proven effective for both time series modeling and multi-modal learning, yet many existing MoE-based modality integration methods still depend on token-level fusion. To address this, we propose Expert Modulation, a new paradigm for multi-modal time series prediction that conditions both routing…
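The abstract's core idea, text that conditions both expert routing and expert computation instead of being mixed in as extra tokens, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract is truncated before the method is specified, so everything here is an assumption made for illustration. That includes the class name `ModulatedMoE`, the additive text bias on the router logits, the top-k gating, and the FiLM-style scale and shift applied to each expert's output.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax; entries set to -inf get weight 0."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class ModulatedMoE:
    """Hypothetical sketch of a text-modulated mixture-of-experts layer.

    Assumptions (not taken from the paper): a single text embedding per
    series, an additive text bias on router logits, and a per-expert
    scale/shift (FiLM-style) computed from the text embedding.
    """

    def __init__(self, d_model, d_text, n_experts, top_k=2, seed=0):
        rng = np.random.default_rng(seed)
        s = 0.1
        self.n_experts = n_experts
        self.top_k = top_k
        # Router weights for the time series patches and for the text.
        self.W_route = rng.normal(0, s, (d_model, n_experts))
        self.W_route_text = rng.normal(0, s, (d_text, n_experts))
        # One small nonlinear map per expert.
        self.W_exp = rng.normal(0, s, (n_experts, d_model, d_model))
        # Text-to-modulation weights (scale and shift) per expert.
        self.W_gamma = rng.normal(0, s, (n_experts, d_text, d_model))
        self.W_beta = rng.normal(0, s, (n_experts, d_text, d_model))

    def route(self, patches, text):
        """Return sparse gate weights of shape (n_patches, n_experts)."""
        logits = patches @ self.W_route + text @ self.W_route_text
        # Keep only the top_k logits in each row, mask the rest.
        kth = np.sort(logits, axis=-1)[:, -self.top_k][:, None]
        masked = np.where(logits >= kth, logits, -np.inf)
        return softmax(masked, axis=-1)

    def forward(self, patches, text):
        """Mix expert outputs; text never enters as a token."""
        gates = self.route(patches, text)
        out = np.zeros_like(patches)
        for e in range(self.n_experts):
            h = np.tanh(patches @ self.W_exp[e])   # expert computation
            gamma = 1.0 + text @ self.W_gamma[e]   # text-derived scale
            beta = text @ self.W_beta[e]           # text-derived shift
            out += gates[:, [e]] * (gamma * h + beta)
        return out


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    layer = ModulatedMoE(d_model=8, d_text=4, n_experts=4, top_k=2)
    patches = rng.normal(size=(5, 8))   # 5 temporal patches
    text = rng.normal(size=4)           # stand-in for a text embedding
    print(layer.forward(patches, text).shape)
```

In this sketch the time series patches are the only tokens the experts process; the text embedding changes which experts are selected and how each expert's output is scaled and shifted, which is one plausible reading of "conditions both routing" and expert computation.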
Taxonomy
Topics: Topic Modeling · Forecasting Techniques and Applications · Time Series Analysis and Forecasting
