Exploring Multimodal AI Reasoning for Meteorological Forecasting from Skew-T Diagrams
ChangJae Lee, Heecheol Yang, Jonghak Choi

TL;DR
This paper develops a lightweight multimodal AI system that interprets Skew-T diagrams for meteorological forecasting. Using visual reasoning with small models, it reports performance comparable to operational models.
Contribution
It introduces a novel curriculum learning framework for training small vision-language models to interpret atmospheric diagrams for weather prediction.
Findings
The fine-tuned VLM achieves performance comparable to operational numerical weather prediction (NWP) models.
Visual grounding and reasoning supervision are crucial to this performance.
Attention maps show focus on relevant meteorological features.
Abstract
Forecasting from atmospheric soundings is a fundamental task in operational meteorology, often requiring structured visual reasoning over Skew-T log-P diagrams by human forecasters. While recent advances in Vision-Language Models (VLMs) have shown promise in other scientific domains, their application to meteorological diagram interpretation remains largely unexplored. In this study, we present a lightweight AI assistant that interprets Skew-T diagrams using a small language model (LM) and a small VLM fine-tuned to emulate human forecasters. Using a curriculum learning framework, we first train the models to identify key atmospheric features from diagrams through visual question answering, followed by chain-of-thought reasoning tasks that estimate precipitation probability based on the derived visual groundings. Model inputs include either textual summaries or generated Skew-T diagrams…
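The two-stage curriculum described in the abstract (visual question answering first, chain-of-thought reasoning second) can be illustrated with a minimal sketch of the sample ordering. This is not the authors' implementation: the `Sample` class, the `curriculum` function, the epoch parameters, and the example prompts are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class Sample:
    """One training example (hypothetical structure, not from the paper)."""
    stage: str   # "vqa" for feature identification, "cot" for reasoning
    prompt: str
    target: str


def curriculum(
    vqa: List[Sample],
    cot: List[Sample],
    vqa_epochs: int = 1,
    cot_epochs: int = 1,
) -> Iterator[Sample]:
    """Yield every VQA sample before any chain-of-thought sample."""
    # Stage 1: learn to identify atmospheric features from the diagram.
    for _ in range(vqa_epochs):
        yield from vqa
    # Stage 2: learn to reason toward a precipitation probability.
    for _ in range(cot_epochs):
        yield from cot


# Placeholder examples; the prompts and targets are invented for illustration.
vqa_data = [Sample("vqa", "Is there a dry layer near the surface?", "yes")]
cot_data = [Sample("cot", "Estimate the precipitation probability.", "reasoning steps, then an estimate")]

order = [s.stage for s in curriculum(vqa_data, cot_data, vqa_epochs=2)]
print(order)
```

The only design point the sketch captures is the ordering: the reasoning stage sees data only after the feature-identification stage has finished, which matches the "first ... followed by" structure in the abstract. How the actual training loop, loss, or model are set up is not specified here.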
Taxonomy
Topics: Semantic Web and Ontologies · Data Mining Algorithms and Applications · Data Management and Algorithms
