MoHETS: Long-term Time Series Forecasting with Mixture-of-Heterogeneous-Experts
Evandro S. Ortigossa, Guy Lutsker, Eran Segal

TL;DR
MoHETS is an encoder-only Transformer with sparse Mixture-of-Heterogeneous-Experts layers for long-term multivariate time series forecasting. It is designed to capture multi-scale structure and reports improved accuracy over existing methods.
Contribution
The paper proposes MoHETS, a Transformer model with heterogeneous mixture-of-experts layers, enhancing long-term forecasting by capturing diverse temporal dynamics and improving parameter efficiency.
Findings
Achieves state-of-the-art performance on seven benchmarks.
Reduces average MSE by 12% compared to recent baselines.
Effectively models multi-scale and non-stationary time series.
Abstract
Real-world multivariate time series can exhibit intricate multi-scale structures, including global trends, local periodicities, and non-stationary regimes, which makes long-horizon forecasting challenging. Although sparse Mixture-of-Experts (MoE) approaches improve scalability and specialization, they typically rely on homogeneous MLP experts that poorly capture the diverse temporal dynamics of time series data. We address these limitations with MoHETS, an encoder-only Transformer that integrates sparse Mixture-of-Heterogeneous-Experts (MoHE) layers. MoHE routes temporal patches to a small subset of expert networks, combining a shared depthwise-convolution expert for sequence-level continuity with routed Fourier-based experts for patch-level periodic structures. MoHETS further improves robustness to non-stationary dynamics by incorporating exogenous information via cross-attention over…
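The routing idea in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation. It assumes a single-channel series already split into equal-length patches, a linear router with top-k softmax gating, an odd-length smoothing kernel with edge padding for the shared convolution expert, and routed "Fourier" experts that simply keep the lowest frequency bins of each patch. All names (`shared_conv_expert`, `fourier_expert`, `mohe_layer`, `keeps`, `router_weights`) are hypothetical, and the cross-attention over exogenous information is not modeled.

```python
import cmath
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def shared_conv_expert(patches, kernel):
    """Shared expert: 1D convolution over the whole sequence.

    Patches are flattened so the kernel spans patch boundaries, which is
    what gives sequence-level continuity. Edge padding and an odd-length
    kernel are assumptions of this sketch.
    """
    seq = [v for patch in patches for v in patch]
    half = len(kernel) // 2
    padded = [seq[0]] * half + seq + [seq[-1]] * half
    out = [
        sum(kernel[j] * padded[t + j] for j in range(len(kernel)))
        for t in range(len(seq))
    ]
    size = len(patches[0])
    return [out[i:i + size] for i in range(0, len(out), size)]


def fourier_expert(patch, keep):
    """Routed expert: keep DFT bins with frequency index below `keep`."""
    n = len(patch)
    spectrum = [
        sum(patch[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # min(k, n - k) is the frequency index of bin k for a real-valued signal.
    filtered = [c if min(k, n - k) < keep else 0 for k, c in enumerate(spectrum)]
    return [
        (sum(filtered[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def mohe_layer(patches, router_weights, keeps, kernel, top_k=2):
    """Shared expert output plus a gated sum of the top-k routed experts."""
    shared = shared_conv_expert(patches, kernel)
    outputs = []
    for patch, shared_patch in zip(patches, shared):
        # One router logit per routed expert (linear router, an assumption).
        logits = [sum(w * x for w, x in zip(row, patch)) for row in router_weights]
        chosen = sorted(range(len(logits)), key=lambda e: logits[e], reverse=True)[:top_k]
        gates = softmax([logits[e] for e in chosen])
        routed = [0.0] * len(patch)
        for gate, e in zip(gates, chosen):
            expert_out = fourier_expert(patch, keeps[e])
            routed = [r + gate * v for r, v in zip(routed, expert_out)]
        outputs.append([s + r for s, r in zip(shared_patch, routed)])
    return outputs


# Toy usage: two patches of length 4, three routed experts, top-2 routing.
patches = [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]]
router_weights = [[0.1, 0.0, 0.0, 0.0], [0.0, 0.1, 0.0, 0.0], [0.0, 0.0, 0.1, 0.0]]
out = mohe_layer(patches, router_weights, keeps=[1, 2, 3], kernel=[0.25, 0.5, 0.25])
```

Only `top_k` routed experts run per patch, which is where the sparsity comes from, while the shared expert always runs on the full sequence. In a trained model the router weights, the convolution kernel, and the spectral filters would be learned rather than fixed as here.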
Taxonomy
Topics: Traffic Prediction and Management Techniques · Forecasting Techniques and Applications · Time Series Analysis and Forecasting
