RARL: Improving Medical VLM Reasoning and Generalization with Reinforcement Learning and LoRA under Data and Hardware Constraints

Tan-Hanh Pham; Chris Ngo

arXiv:2506.06600 · cs.CV · June 17, 2025

TL;DR

This paper introduces RARL, a reinforcement learning framework that enhances medical vision-language models' reasoning and generalization capabilities while being efficient enough for low-resource environments.

Contribution

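RARL fine-tunes a lightweight medical VLM, Qwen2-VL-2B-Instruct, with reinforcement learning and Low-Rank Adaptation (LoRA), using custom reward functions that weigh both diagnostic accuracy and reasoning quality, to improve reasoning and generalization under data and hardware constraints.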
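To make the LoRA part of this contribution concrete, here is a minimal pure-Python sketch of the standard low-rank update: a frozen weight matrix W plus a trainable product B·A scaled by alpha/rank. The layer size (1536 x 1536), the rank (8), and all function names are illustrative assumptions; the page does not state the paper's actual LoRA configuration.

```python
# Minimal LoRA sketch. All sizes and names are hypothetical, chosen
# only for illustration; they are not taken from the paper.

def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one linear layer."""
    full = d_out * d_in            # every entry of W is trained
    lora = rank * (d_in + d_out)   # only A (rank x d_in) and B (d_out x rank)
    return full, lora


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha, rank):
    """Compute W x + (alpha / rank) * B (A x), with W kept frozen."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


if __name__ == "__main__":
    full, lora = lora_param_counts(1536, 1536, 8)
    print(full, lora, full // lora)  # 2359296 24576 96
```

With B initialized to zeros, as is conventional for LoRA, the adapted layer reproduces the frozen layer's output exactly at the start of training, and only the two small matrices receive gradient updates. This is what keeps memory use low enough for single-GPU fine-tuning.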

Findings

01. RARL outperforms supervised fine-tuning by approximately 7.78% on reasoning tasks.

02. The approach achieves around 27% better performance on unseen datasets.

03. Training runs on a single NVIDIA A100-PCIE-40GB GPU, demonstrating efficiency and practicality.

Abstract

The growing integration of vision-language models (VLMs) in medical applications offers promising support for diagnostic reasoning. However, current medical VLMs often face limitations in generalization, transparency, and computational efficiency, barriers that hinder deployment in real-world, resource-constrained settings. To address these challenges, we propose a Reasoning-Aware Reinforcement Learning framework, RARL, that enhances the reasoning capabilities of medical VLMs while remaining efficient and adaptable to low-resource environments. Our approach fine-tunes a lightweight base model, Qwen2-VL-2B-Instruct, using Low-Rank Adaptation and custom reward functions that jointly consider diagnostic accuracy and reasoning quality. Training is performed on a single NVIDIA A100-PCIE-40GB GPU, demonstrating the feasibility of deploying such models in constrained environments. We…
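The abstract mentions custom reward functions that jointly consider diagnostic accuracy and reasoning quality but, as excerpted here, does not define them. The following is only a sketch of what such a combined reward could look like. The `<think>`/`<answer>` tag format, the word-count proxy for reasoning quality, the equal weights, and every function name are assumptions made for illustration, not the paper's definitions.

```python
import re

# Hypothetical combined reward. The tag format, the scoring rules, and
# the weights are illustrative assumptions, not taken from the paper.


def accuracy_reward(completion, gold):
    """1.0 if the text inside <answer>...</answer> matches the gold label."""
    match = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    if match is None:
        return 0.0
    predicted = match.group(1).strip().lower()
    return 1.0 if predicted == gold.strip().lower() else 0.0


def reasoning_reward(completion, min_words=5):
    """Crude proxy: reward a <think> block, with less credit if very short."""
    match = re.search(r"<think>(.*?)</think>", completion, re.DOTALL)
    if match is None:
        return 0.0
    return 1.0 if len(match.group(1).split()) >= min_words else 0.5


def combined_reward(completion, gold, w_acc=0.5, w_reason=0.5):
    """Weighted sum of the accuracy and reasoning terms."""
    return (w_acc * accuracy_reward(completion, gold)
            + w_reason * reasoning_reward(completion))


if __name__ == "__main__":
    sample = ("<think>The opacity in the left lower lobe suggests "
              "consolidation.</think><answer>B</answer>")
    print(combined_reward(sample, "b"))  # 1.0
```

A weighted sum like this lets a completion earn partial credit for a well-formed rationale even when the final answer is wrong, which is one plausible way to push a policy toward explicit reasoning. A real reasoning-quality term would need something stronger than a word count.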

Taxonomy

Topics: Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI) · Machine Learning in Healthcare