TriTS: Time Series Forecasting from a Multimodal Perspective
Xiang Ao

TL;DR
TriTS introduces a multimodal framework for long-term time series forecasting, leveraging time, frequency, and visual modalities to improve accuracy and efficiency over existing methods.
Contribution
The paper proposes TriTS, a novel cross-modal disentanglement framework that combines time, frequency, and visual representations for enhanced long-term forecasting.
Findings
TriTS achieves state-of-the-art performance on multiple benchmarks.
It significantly reduces parameter count and inference latency compared to vision-based forecasters.
The approach effectively captures complex temporal dynamics through multimodal fusion.
Abstract
Time series forecasting plays a pivotal role in critical sectors such as finance, energy, transportation, and meteorology. However, Long-term Time Series Forecasting (LTSF) remains a significant challenge because real-world signals contain highly entangled temporal dynamics that are difficult to fully capture from a purely 1D perspective. To break this representation bottleneck, we propose TriTS, a novel cross-modal disentanglement framework that projects 1D time series into orthogonal time, frequency, and 2D-vision spaces. To seamlessly bridge the 1D-to-2D modality gap without the prohibitive computational overhead of Vision Transformers (ViTs), we introduce a Period-Aware Reshaping strategy and incorporate Visual Mamba (Vim). This approach efficiently models cross-period dependencies as global visual textures while maintaining linear computational complexity. Complementing…
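The abstract names a Period-Aware Reshaping step but does not describe how it works. As a rough illustration only, the sketch below assumes one common way to fold a 1D series into a 2D array: estimate the dominant period from the FFT amplitude spectrum, then stack consecutive periods as rows so that cross-period structure appears along the vertical axis. The function names `dominant_period` and `period_reshape`, the zero-padding choice, and the synthetic sine input are all hypothetical and are not taken from the paper.

```python
import numpy as np


def dominant_period(x: np.ndarray) -> int:
    """Estimate the dominant period from the FFT amplitude spectrum (assumed heuristic)."""
    x = np.asarray(x, dtype=float)
    amp = np.abs(np.fft.rfft(x - x.mean()))
    amp[0] = 0.0  # ignore the DC component
    k = int(np.argmax(amp))  # index of the strongest frequency (cycles per window)
    if k == 0:
        return len(x)  # no periodic component found; treat the window as one period
    return max(1, round(len(x) / k))


def period_reshape(x: np.ndarray, period: int) -> np.ndarray:
    """Fold a 1D series into a (n_periods, period) array, zero-padding the tail."""
    x = np.asarray(x, dtype=float)
    n_rows = -(-len(x) // period)  # ceiling division
    padded = np.zeros(n_rows * period)
    padded[: len(x)] = x
    return padded.reshape(n_rows, period)


# Synthetic example: 96 steps of a signal with a 24-step cycle.
t = np.arange(96)
series = np.sin(2 * np.pi * t / 24)

p = dominant_period(series)
image = period_reshape(series, p)
print(p, image.shape)
```

In this layout each row holds one cycle and each column holds the same phase across cycles, which is the kind of 2D structure a vision backbone could then treat as texture. How TriTS actually selects periods or handles multiple periodicities is not stated in the text above.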
