Gazal-R1: Achieving State-of-the-Art Medical Reasoning with Parameter-Efficient Two-Stage Training

Ahmed M. Adly; Mostafa Samy; Amr Fawzy

arXiv:2506.21594 · cs.CL · June 30, 2025

TL;DR

Gazal-R1 is a 32-billion-parameter medical reasoning model that outperforms larger models through a novel two-stage training process emphasizing structured reasoning and explainability.

Contribution

The paper introduces a new two-stage training pipeline with parameter-efficient techniques and reinforcement learning to enhance medical reasoning in mid-sized language models.

Findings

01. Achieves state-of-the-art scores on multiple medical benchmarks.

02. Outperforms models up to 12x larger on medical reasoning tasks.

03. Provides transparent, step-by-step clinical explanations.

Abstract

We present Gazal-R1, a 32-billion-parameter language model that achieves state-of-the-art performance in medical reasoning while providing transparent, step-by-step explanations for clinical decision-making. Built upon Qwen3 32B, our model demonstrates that strategic training can enable mid-sized models to outperform significantly larger counterparts in specialized domains. We developed a novel two-stage training pipeline: first, supervised fine-tuning on a carefully curated dataset of 107,033 synthetic medical reasoning examples that teaches structured clinical thinking, enhanced by advanced parameter-efficient techniques including Weight-Decomposed Low-Rank Adaptation (DoRA) and Rank-Stabilized LoRA (rsLoRA); second, reinforcement learning using Group Relative Policy Optimization (GRPO) with a sophisticated multi-component reward system that refines accuracy, format adherence, and…
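The group-relative step that gives GRPO its name can be illustrated with a minimal sketch. It is not the authors' implementation: the two reward components shown (accuracy and format adherence) are the ones named in the abstract, while the weights `w_acc` and `w_fmt`, the function names, and the use of the population standard deviation are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def combined_reward(accuracy: float, format_ok: float,
                    w_acc: float = 1.0, w_fmt: float = 0.5) -> float:
    """Weighted sum of reward components (weights are hypothetical)."""
    return w_acc * accuracy + w_fmt * format_ok


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize each reward against its own sampling group.

    GRPO samples several completions for one prompt and scores each
    relative to the group mean and spread, so no learned value model
    is needed. Population std is an assumption of this sketch.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled completions for one prompt: (accuracy, format adherence).
group = [(1.0, 1.0), (1.0, 0.0), (0.0, 1.0), (0.0, 0.0)]
rewards = [combined_reward(a, f) for a, f in group]
advantages = group_relative_advantages(rewards)
```

Completions that score above their group's mean receive a positive advantage and are reinforced; those below receive a negative one. When every completion in a group earns the same reward, all advantages are zero and that group contributes no learning signal.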

Code & Models

Models

Hugging Face: TachyHealth/Gazal-R1-32B-GRPO-preview (model, 7 downloads, 1 like)

Taxonomy

Topics: Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI)