Gazal-R1: Achieving State-of-the-Art Medical Reasoning with Parameter-Efficient Two-Stage Training
Ahmed M. Adly, Mostafa Samy, Amr Fawzy

TL;DR
Gazal-R1 is a 32-billion-parameter medical reasoning model that outperforms larger models through a novel two-stage training process emphasizing structured reasoning and explainability.
Contribution
The paper introduces a new two-stage training pipeline with parameter-efficient techniques and reinforcement learning to enhance medical reasoning in mid-sized language models.
Findings
Achieves state-of-the-art scores on multiple medical benchmarks.
Outperforms models up to 12x larger on medical reasoning tasks.
Provides transparent, step-by-step clinical explanations.
Abstract
We present Gazal-R1, a 32-billion-parameter language model that achieves state-of-the-art performance in medical reasoning while providing transparent, step-by-step explanations for clinical decision-making. Built upon Qwen3 32B, our model demonstrates that strategic training can enable mid-sized models to outperform significantly larger counterparts in specialized domains. We developed a novel two-stage training pipeline: first, supervised fine-tuning on a carefully curated dataset of 107,033 synthetic medical reasoning examples that teaches structured clinical thinking, enhanced by advanced parameter-efficient techniques including Weight-Decomposed Low-Rank Adaptation (DoRA) and Rank-Stabilized LoRA (rsLoRA); second, reinforcement learning using Group Relative Policy Optimization (GRPO) with a sophisticated multi-component reward system that refines accuracy, format adherence, and…
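As a rough illustration of the second stage, the sketch below shows the group-relative advantage step that gives GRPO its name: several answers are sampled for the same prompt, each is scored, and every score is normalised against the mean and standard deviation of its own group. It is not the authors' implementation. The function names, the use of the population standard deviation, and the reward weights are assumptions made for this example, and the combined reward covers only the two components the abstract names before it is cut off.

```python
from statistics import mean, pstdev


def combined_reward(accuracy, format_score, weights=(1.0, 0.5)):
    """Weighted sum of two reward components.

    The weights are hypothetical; the paper's actual reward system has
    more components and its own weighting.
    """
    w_acc, w_fmt = weights
    return w_acc * accuracy + w_fmt * format_score


def group_relative_advantages(rewards, eps=1e-8):
    """Normalise each reward against its own sampling group.

    rewards: scores for several completions of the SAME prompt.
    Returns (r - group_mean) / (group_std + eps) for each reward.
    Population standard deviation is an assumption of this sketch.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one prompt: (accuracy, format adherence).
group = [(1.0, 1.0), (0.0, 1.0), (0.0, 0.0), (1.0, 0.0)]
rewards = [combined_reward(acc, fmt) for acc, fmt in group]
advantages = group_relative_advantages(rewards)
```

Answers that score above their group's average receive a positive advantage and are reinforced; those below receive a negative one. Because the baseline comes from the group itself, no separate value network is needed.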
Taxonomy
Topics: Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI)
