CLR-voyance: Reinforcing Open-Ended Reasoning for Inpatient Clinical Decision Support with Outcome-Aware Rubrics
Aishik Nagar, Arun-Kumar Kaliya-Perumal, Yu-Hsuan Han, Andrew Sheng-Han Huang, Kristen Kee, Yushi Cao, Yiming Chen, Hongchao Jiang

TL;DR
CLR-voyance reformulates inpatient clinical reasoning as a POMDP and supervises it with outcome-grounded, clinician-validated rubric rewards to improve both reasoning accuracy and evaluation, reporting state-of-the-art results and real-world deployment.
Contribution
It introduces a novel framework that combines outcome-grounded rewards with clinician-validated rubrics for inpatient reasoning, enhancing model performance and interpretability.
Findings
CLR-voyance-8B achieves 84.91% on CLR-POMDP, outperforming GPT-5 and MedGemma-27B.
Models trained with CLR-voyance show state-of-the-art reasoning capabilities.
Clinician studies validate the clinical relevance and effectiveness of the approach.
Abstract
Inpatient clinical reasoning is a sequential decision under partial observability: the clinician sees the admission so far and must choose the next action, whose downstream consequences are not yet visible. Existing clinical-LLM evaluations and RL reward signals collapse this into closed-form retrieval, clinical journey leakage, or unanchored LLM-as-judge scoring. We introduce CLR-voyance, a framework that reformulates inpatient reasoning as a Partially Observable Markov Decision Process (POMDP) and supervises it with rewards that are simultaneously outcome-grounded and clinician-validated. We instantiate the formulation as CLR-POMDP, which partitions successful patient journeys into a policy-visible past and an oracle-only future. Using the past information, an oracle LLM generates a case-specific query-answer pair and the first adaptive rubric for clinical reasoning, which is…
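The partition into a policy-visible past and an oracle-only future can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Event` schema, the `split_journey` helper, the hour-based cutoff, and the toy admission are hypothetical and are not taken from the paper's implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Event:
    """One timestamped entry in an admission (hypothetical schema)."""
    hour: float        # hours since admission
    description: str


def split_journey(
    events: List[Event], cutoff_hour: float
) -> Tuple[List[Event], List[Event]]:
    """Partition a journey into (policy-visible past, oracle-only future).

    Events at or before the cutoff are what the policy may see; events
    after it are reserved for the oracle and must never reach the policy.
    """
    ordered = sorted(events, key=lambda e: e.hour)
    past = [e for e in ordered if e.hour <= cutoff_hour]
    future = [e for e in ordered if e.hour > cutoff_hour]
    return past, future


# Toy admission, invented purely for illustration.
journey = [
    Event(0.0, "admitted with fever and hypotension"),
    Event(2.0, "blood cultures drawn"),
    Event(6.0, "broad-spectrum antibiotics started"),
    Event(48.0, "afebrile, transferred to ward"),
]

past, future = split_journey(journey, cutoff_hour=2.0)
```

Keeping the two halves as separate return values makes the leakage boundary explicit: only `past` would be passed to the model being trained or evaluated, while `future` would be available solely to whatever component builds the reference answer and rubric.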
