Reason2Decide: Rationale-Driven Multi-Task Learning

H M Quamran Hasan; Housam Khalifa Bashier; Jiayi Dai; Mi-Young Kim; Randy Goebel

arXiv:2512.20074·cs.AI·May 7, 2026

TL;DR

Reason2Decide is a two-stage training framework for clinical decision support that improves both prediction accuracy and explanation quality. It can be trained on LLM-generated rationales, which reduces reliance on human annotations.

Contribution

It introduces a novel two-stage training method with scheduled sampling to enhance rationale alignment and performance in multi-task learning for clinical NLP tasks.

Findings

01

Outperforms fine-tuning baselines and some zero-shot LLMs in prediction and rationale fidelity.

02

Achieves robustness across different rationale sources, including LLM-generated and nurse-authored.

03

Operates effectively with models 40x smaller than large foundation models.

Abstract

Despite the wide adoption of Large Language Models (LLMs), clinical decision support systems face a critical challenge: achieving high predictive accuracy while generating explanations aligned with the predictions. Current approaches suffer from exposure bias, which leads to misaligned explanations. We propose Reason2Decide, a two-stage training framework that addresses key challenges in self-rationalization, including exposure bias and task separation. In Stage-1, our model is trained on rationale generation, while in Stage-2, we jointly train on label prediction and rationale generation, applying scheduled sampling to gradually transition from conditioning on gold labels to model predictions. We evaluate Reason2Decide on three medical datasets, including a proprietary triage dataset and public biomedical QA datasets. Across model sizes, Reason2Decide outperforms other fine-tuning baselines…
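
The Stage-2 scheduled sampling described in the abstract can be illustrated with a minimal sketch. The abstract does not state the shape of the schedule, so the linear decay below is an assumption, and the names `gold_label_probability` and `choose_conditioning_label` are hypothetical helpers introduced only for illustration, not the authors' implementation.

```python
import random


def gold_label_probability(step, total_steps, start=1.0, end=0.0):
    """Probability of conditioning the rationale on the gold label.

    Assumption: a linear decay from `start` to `end` over training.
    The abstract only says the transition is gradual.
    """
    if total_steps <= 1:
        return end
    # Clamp progress to [0, 1] so out-of-range steps stay well defined.
    frac = min(max(step / (total_steps - 1), 0.0), 1.0)
    return start + (end - start) * frac


def choose_conditioning_label(gold_label, predicted_label, step, total_steps, rng):
    """Pick the label the rationale generator is conditioned on at this step."""
    p_gold = gold_label_probability(step, total_steps)
    # Early in training p_gold is near 1, so the gold label is used;
    # late in training the model's own prediction is used instead.
    return gold_label if rng.random() < p_gold else predicted_label


if __name__ == "__main__":
    rng = random.Random(0)
    for step in (0, 5, 10):
        label = choose_conditioning_label("gold", "predicted", step, 11, rng)
        print(step, gold_label_probability(step, 11), label)
```

Conditioning on the model's own predictions late in training exposes the rationale generator to the same inputs it will see at inference time, which is how scheduled sampling targets the exposure bias mentioned above.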
