AstroVLM: Expert Multi-agent Collaborative Reasoning for Astronomical Imaging Quality Diagnosis

Yaohui Han; Tianshuo Wang; Zixi Zhao; Zhengchun Zhu; Shuo Ren; Yiru Wang; Rongliang Fu; Tinghuan Chen; Tsung-Yi Ho

arXiv:2604.16024·cs.MA·April 20, 2026

TL;DR

AstroVLM is a multi-agent system leveraging vision-language models to diagnose and localize errors in complex astronomical images, outperforming existing methods.

Contribution

The paper introduces AstroVLM, a collaborative multi-agent framework for astronomical image quality diagnosis, a task in which interdependent imaging processes make errors hard to localize.

Findings

01

AstroVLM outperforms all baseline methods on real-world tasks.

02

The system effectively handles complex correlations in astronomical imaging.

03

AstroVLM provides a reference for applying language models to multi-process problem solving.

Abstract

Vision Language Models (VLMs) have been applied to several specialized domains and have shown strong problem-solving capabilities. However, astronomical imaging, a complex problem involving multidisciplinary knowledge and several subtasks, has not been adequately studied. The imaging process is complex enough that both world-class astronomical organizations, such as NASA, and expert enthusiasts devote a great deal of time and effort to it. Its constituent processes have complex underlying correlations that significantly influence one another, which makes quality diagnosis and error localization of astronomical images challenging. To address this problem, we propose AstroVLM, a collaborative multi-agent system for diagnosing the quality of astronomical images. Experimental results show that AstroVLM outperforms all baselines on real-world…
