SARChat-Bench-2M: A Multi-Task Vision-Language Benchmark for SAR Image Interpretation
Zhiming Ma, Xiayang Xiao, Sihao Dong, Peidong Wang, HaiPeng Wang, Qingyun Pan

TL;DR
This paper introduces SARChat-2M, the first large-scale multimodal dataset for SAR image interpretation, enabling the evaluation of vision-language models in the remote sensing domain with diverse scenarios and detailed annotations.
Contribution
It develops a comprehensive multimodal dataset and benchmark for SAR images, facilitating advanced VLM applications in remote sensing.
Findings
Evaluates 16 mainstream VLMs on SARChat-2M
Demonstrates the dataset's utility in SAR image understanding tasks
Provides a framework for multimodal dataset construction in remote sensing
Abstract
As a powerful all-weather Earth observation tool, synthetic aperture radar (SAR) remote sensing enables critical military reconnaissance, maritime surveillance, and infrastructure monitoring. Although vision-language models (VLMs) have made remarkable progress in natural language processing and image understanding, their applications remain limited in professional domains due to insufficient domain expertise. This paper proposes the first large-scale multimodal dialogue dataset for SAR images, named SARChat-2M, which contains approximately 2 million high-quality image-text pairs and encompasses diverse scenarios with detailed target annotations. This dataset not only supports several key tasks such as visual understanding and object detection, but also has unique innovative aspects: this study develops a visual-language dataset and benchmark for the SAR domain, enabling…
Taxonomy
Topics: Methane Hydrates and Related Phenomena · Multimodal Machine Learning Applications · Underwater Acoustics Research
