Spectral-Aware Text-to-Time Series Generation with Billion-Scale Multimodal Meteorological Data

Shijie Zhang

arXiv:2603.27135·cs.LG·March 31, 2026

TL;DR

This paper introduces MeteoCap-3B, a billion-scale captioned meteorological dataset, and MTransformer, a spectral-aware diffusion model for text-guided weather time-series generation. It reports state-of-the-art generation quality on real-world benchmarks and improved semantic controllability.

Contribution

The work presents MeteoCap-3B, a billion-scale weather dataset with expert captions, and MTransformer, a spectral-aware diffusion model for precise text-to-weather time-series synthesis.

Findings

01 State-of-the-art generation quality on real-world benchmarks

02 Accurate cross-modal alignment between text and weather signals

03 Enhanced semantic controllability and improved forecasting in data-sparse scenarios

Abstract

Text-to-time-series generation is particularly important in meteorology, where natural language offers intuitive control over complex, multi-scale atmospheric dynamics. Existing approaches are constrained by the lack of large-scale, physically grounded multimodal datasets and by architectures that overlook the spectral-temporal structure of weather signals. We address these challenges with a unified framework for text-guided meteorological time-series generation. First, we introduce MeteoCap-3B, a billion-scale weather dataset paired with expert-level captions constructed via a Multi-agent Collaborative Captioning (MACC) pipeline, yielding information-dense and physically consistent annotations. Building on this dataset, we propose MTransformer, a diffusion-based model that enables precise semantic control by mapping textual descriptions into multi-band spectral priors through a…
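The abstract is cut off before it says how the multi-band spectral priors are built, so the sketch below only illustrates the general idea of splitting a time series into frequency bands. It is not MTransformer's mechanism. The function names (`dft`, `idft`, `split_into_bands`), the band edges, and the synthetic two-tone signal are all assumptions made for illustration, and the naive O(n^2) transform is used only to keep the example dependency-free.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform (illustrative, not optimized)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns the real part, valid for real-valued inputs."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def split_into_bands(signal, band_edges):
    """Split a real signal into one time-domain component per frequency band.

    band_edges is an ascending list of frequency-index boundaries (hypothetical
    choice, e.g. [0, 4]); the final edge n // 2 + 1 is added implicitly, so
    band i keeps the coefficients whose absolute frequency index lies in
    [edges[i], edges[i + 1]).
    """
    n = len(signal)
    spectrum = dft(signal)
    edges = list(band_edges) + [n // 2 + 1]
    bands = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        # min(k, n - k) folds negative frequencies onto their positive mirror,
        # so each band keeps conjugate pairs together and stays real-valued.
        masked = [
            coeff if lo <= min(k, n - k) < hi else 0
            for k, coeff in enumerate(spectrum)
        ]
        bands.append(idft(masked))
    return bands


# Synthetic example: a slow cycle plus a weaker fast oscillation.
n = 48
slow = [math.sin(2 * math.pi * 1 * t / n) for t in range(n)]
fast = [0.3 * math.sin(2 * math.pi * 10 * t / n) for t in range(n)]
signal = [s + f for s, f in zip(slow, fast)]

low_band, high_band = split_into_bands(signal, [0, 4])
```

Because the bands partition the spectrum, summing `low_band` and `high_band` sample by sample recovers `signal` up to floating-point error. In a text-conditioned setting, one could imagine a caption influencing each band separately, but how the paper does this is not stated in the visible text.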
