Multi-Scale Wavelet Transformers for Operator Learning of Dynamical Systems

Xuesong Wang; Michael Groom; Rafael Oliveira; He Zhao; Terence O'Kane; Edwin V. Bonilla

arXiv:2602.01486·cs.LG·May 7, 2026

TL;DR

This paper introduces multi-scale wavelet transformers (MSWTs) that learn dynamical system behaviors in a wavelet domain, effectively capturing high-frequency details and improving long-term forecasting accuracy.

Contribution

The paper proposes MSWTs, a novel model that uses wavelet transforms and attention mechanisms to better represent multi-scale features in dynamical systems.

Findings

1. MSWTs significantly reduce errors in chaotic system predictions.
2. MSWTs improve spectral fidelity in long-term forecasts.
3. MSWTs decrease climatological bias in climate reanalysis data.

Abstract

Recent years have seen a surge in data-driven surrogates for dynamical systems that can be orders of magnitude faster than numerical solvers. However, many machine learning-based models such as neural operators exhibit spectral bias, attenuating high-frequency components that often encode small-scale structure. This limitation is particularly damaging in applications such as weather forecasting, where misrepresented high frequencies can induce long-horizon instability. To address this issue, we propose multi-scale wavelet transformers (MSWTs), which learn system dynamics in a tokenized wavelet domain. The wavelet transform explicitly separates low- and high-frequency content across scales. MSWTs leverage a wavelet-preserving downsampling scheme that retains high-frequency features and employ wavelet-based attention to capture dependencies across scales and frequency bands. Experiments…
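
To make concrete the statement that a wavelet transform separates low- and high-frequency content across scales, here is a minimal sketch of a multi-level discrete wavelet decomposition of a 1D signal. It assumes the Haar wavelet and plain NumPy purely for illustration; the abstract does not say which wavelet family the authors use, and the function names (`haar_dwt`, `haar_decompose`, and so on) are hypothetical. It does not attempt to reproduce the MSWT tokenization, downsampling scheme, or attention.

```python
import numpy as np


def haar_dwt(x):
    """One level of the orthonormal Haar transform on an even-length 1D signal."""
    x = np.asarray(x, dtype=float)
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)  # low-frequency band: scaled local sums
    detail = (even - odd) / np.sqrt(2.0)  # high-frequency band: scaled local differences
    return approx, detail


def haar_idwt(approx, detail):
    """Invert one Haar level, interleaving the recovered even and odd samples."""
    x = np.empty(2 * approx.size)
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x


def haar_decompose(x, levels):
    """Repeatedly split the low-frequency band; details are ordered fine to coarse."""
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        details.append(detail)
    return approx, details


def haar_reconstruct(approx, details):
    """Rebuild the signal from the coarsest band upward."""
    for detail in reversed(details):
        approx = haar_idwt(approx, detail)
    return approx


# Illustrative signal (an assumption, not data from the paper): a slow sine
# plus a small oscillation at the highest representable frequency.
n = np.arange(64)
signal = np.sin(2 * np.pi * n / 64) + 0.1 * np.cos(np.pi * n)

coarse, details = haar_decompose(signal, levels=3)
rebuilt = haar_reconstruct(coarse, details)
```

Each level halves the resolution of the low-frequency band while storing the discarded fine structure in a separate detail band, so nothing is lost: `rebuilt` matches `signal` up to floating-point error, and the total squared energy of the coefficients equals that of the signal because the transform is orthonormal.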
