HAIBU-ReMUD: Reasoning Multimodal Ultrasound Dataset and Model Bridging to General Specific Domains

Shijie Wang; Yilun Zhang; Zeyu Lai; and Dexing Kong

arXiv:2506.07837·cs.AI·June 10, 2025

HAIBU-ReMUD: Reasoning Multimodal Ultrasound Dataset and Model Bridging to General Specific Domains

Shijie Wang, Yilun Zhang, Zeyu Lai, and Dexing Kong

PDF

Open Access 1 Repo

TL;DR

This paper introduces a new dataset and model for medical ultrasound understanding, bridging the gap in domain-specific multimodal language models by generating specialized data and fine-tuning a large model.

Contribution

It proposes a novel data generation pipeline and creates the ReMUD dataset, enabling effective fine-tuning of MLLMs for medical ultrasound domain tasks.

Findings

01

ReMUD dataset contains over 45,000 QA and VQA samples.

02

ReMUD-7B outperforms general-domain MLLMs in ultrasound tasks.

03

Data and model resources will be publicly released.

Abstract

Multimodal large language models (MLLMs) have shown great potential in general domains but perform poorly in some specific domains due to a lack of domain-specific data, such as image-text data or vedio-text data. In some specific domains, there is abundant graphic and textual data scattered around, but lacks standardized arrangement. In the field of medical ultrasound, there are ultrasonic diagnostic books, ultrasonic clinical guidelines, ultrasonic diagnostic reports, and so on. However, these ultrasonic materials are often saved in the forms of PDF, images, etc., and cannot be directly used for the training of MLLMs. This paper proposes a novel image-text reasoning supervised fine-tuning data generation pipeline to create specific domain quadruplets (image, question, thinking trace, and answer) from domain-specific materials. A medical ultrasound domain dataset ReMUD is established,…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Repositories

shidaizi/remud
noneOfficial

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsMultimodal Machine Learning Applications · Topic Modeling · Domain Adaptation and Few-Shot Learning