MedOmni-45°: A Safety-Performance Benchmark for Reasoning-Oriented LLMs in Medicine
Kaiyuan Ji, Yijin Guo, Zicheng Zhang, Xiangyang Zhu, Yuan Tian, Ning Liu, Guangtao Zhai

TL;DR
MedOmni-45 Degrees is a benchmark for evaluating the trade-off between reasoning safety and performance in medical language models. It highlights vulnerabilities such as unfaithful chain-of-thought reasoning and sycophancy across diverse models and tasks.
Contribution
The paper introduces MedOmni-45 Degrees, a novel benchmark with a workflow and metrics to quantify safety-performance trade-offs in medical LLMs under manipulative hints.
Findings
Models show a consistent safety-performance trade-off, and no model surpasses the diagonal of the safety-performance plot.
The open-source QwQ-32B comes closest to the optimal balance of safety and accuracy.
The benchmark exposes reasoning vulnerabilities and can guide safer model development.
Abstract
With the increasing use of large language models (LLMs) in medical decision-support, it is essential to evaluate not only their final answers but also the reliability of their reasoning. Two key risks are Chain-of-Thought (CoT) faithfulness -- whether reasoning aligns with responses and medical facts -- and sycophancy, where models follow misleading cues over correctness. Existing benchmarks often collapse such vulnerabilities into single accuracy scores. To address this, we introduce MedOmni-45 Degrees, a benchmark and workflow designed to quantify safety-performance trade-offs under manipulative hint conditions. It contains 1,804 reasoning-focused medical questions across six specialties and three task types, including 500 from MedMCQA. Each question is paired with seven manipulative hint types and a no-hint baseline, producing about 27K inputs. We evaluate seven LLMs spanning open-…
