JW-VL: A Vision-Language Model for Solar Physics

Mingfu Shao; Hui Wang; Liyue Tong; Yuyang Li; Cunshi Wang; Jiaben Lin; Suo Liu; Haiqing Xu; Yin Zhang; Jing Huang

arXiv:2603.28504 · astro-ph.SR · March 31, 2026

TL;DR

JW-VL is a specialized vision-language model tailored for solar physics that integrates multi-wavelength data to enhance solar image analysis and reasoning tasks.

Contribution

The paper introduces JW-VL, a fine-tuned foundation model for solar physics that combines multi-wavelength observational data with a cross-modal alignment knowledge distillation framework to support solar image recognition, analysis, and reasoning.

Findings

01

JW-VL enables end-to-end modeling from raw solar observational data to downstream tasks.

02

It supports tasks like image recognition, question answering, and OCR.

03

A solar activity report agent demonstrates interdisciplinary application.

Abstract

Vision-Language Models (VLMs) have achieved breakthrough progress in general knowledge domains, yet adaptation to specialized scientific fields remains challenging due to multimodal representation shifts and the limited integration of domain-specific knowledge. To address the limitations of general-purpose VLMs when applied to solar physics image recognition, analysis, and reasoning, we propose JinWu Vision-Language (JW-VL), a fine-tuned foundation model tailored for solar physics. The model integrates multi-wavelength observational data from both space-based and ground-based telescopes, encompassing representative spectral bands spanning the photosphere, chromosphere, and corona. Built upon a cross-modal alignment knowledge distillation framework, JW-VL learns a joint visual-semantic embedding that enables end-to-end modeling from raw solar observational data to downstream tasks,…
