Towards Efficient Medical Reasoning with Minimal Fine-Tuning Data

Xinlin Zhuang; Feilong Tang; Haolin Yang; Xiwei Liu; Ming Hu; Huifa Li; Haochen Xue; Junjun He; Zongyuan Ge; Yichen Li; Ying Qian; Imran Razzak

arXiv:2508.01450 · cs.CL · March 17, 2026

Open Access

TL;DR

This paper introduces DIQ, a data selection method that improves medical reasoning in vision-language models by choosing high-quality, challenging samples, enabling high performance with minimal fine-tuning data.

Contribution

The paper proposes the DIQ strategy that balances difficulty and influence for selecting training data, significantly reducing data requirements while enhancing reasoning quality.

Findings

01. Fine-tuning on only 1% of the data, selected by DIQ, matches full-dataset performance.

02. With a DIQ-selected 10% subset, models outperform baseline methods across benchmarks.

03. DIQ improves the alignment of model reasoning with expert clinical practices.

Abstract

Supervised Fine-Tuning (SFT) of the language backbone plays a pivotal role in adapting Vision-Language Models (VLMs) to specialized domains such as medical reasoning. However, existing SFT practices often rely on unfiltered textual datasets that contain redundant and low-quality samples, leading to substantial computational costs and suboptimal performance in complex clinical scenarios. Although existing methods attempt to alleviate this problem by selecting data based on sample difficulty, defined by knowledge and reasoning complexity, they overlook each sample's optimization utility reflected in its gradient. Interestingly, we find that gradient-based influence alone favors easy-to-optimize samples that cause large parameter shifts but lack deep reasoning chains, while difficulty alone selects noisy or overly complex textual cases that fail to guide stable optimization. Based on this…
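
The abstract's observation that neither signal works alone can be illustrated with a small selection sketch. This is not the paper's DIQ procedure, which the truncated abstract does not specify. It assumes that per-sample difficulty and influence scores have already been computed, and the function name, the rank normalization, and the minimum-based combination rule are all hypothetical choices made for illustration.

```python
import numpy as np


def select_joint_subset(difficulty, influence, fraction):
    """Pick samples that score high on BOTH difficulty and influence.

    Hypothetical illustration only: the combination rule below is an
    assumption, not the DIQ criterion from the paper.
    """
    difficulty = np.asarray(difficulty, dtype=float)
    influence = np.asarray(influence, dtype=float)
    if difficulty.ndim != 1 or difficulty.shape != influence.shape:
        raise ValueError("scores must be 1-D arrays of equal length")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")

    n = difficulty.size
    k = max(1, int(round(fraction * n)))

    # Rank-normalize each score to [0, 1] so the two scales are comparable.
    denom = max(n - 1, 1)
    d_rank = difficulty.argsort().argsort() / denom
    i_rank = influence.argsort().argsort() / denom

    # The minimum is high only when both ranks are high, so a sample that
    # is difficult but uninfluential (or the reverse) is pushed down.
    joint = np.minimum(d_rank, i_rank)

    # Stable sort on the negated score: best first, ties by original index.
    order = np.argsort(-joint, kind="stable")
    return order[:k]


# Toy scores: sample 0 is hard but uninfluential, sample 1 is the reverse,
# sample 2 is high on both, sample 3 is low on both.
toy_difficulty = [0.9, 0.1, 0.8, 0.2]
toy_influence = [0.1, 0.9, 0.7, 0.3]
chosen = select_joint_subset(toy_difficulty, toy_influence, fraction=0.25)
```

On the toy scores, ranking by difficulty alone would put sample 0 first and ranking by influence alone would put sample 1 first, while the joint rule prefers sample 2, which is the trade-off the abstract describes.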

Taxonomy

Topics: Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education · Topic Modeling