Health-ORSC-Bench: A Benchmark for Measuring Over-Refusal and Safety Completion in Health Context

Zhihao Zhang; Liting Huang; Guanghao Wu; Preslav Nakov; Heng Ji; Usman Naseem

arXiv:2601.17642 · cs.AI · January 27, 2026

TL;DR

This paper introduces Health-ORSC-Bench, a comprehensive benchmark to evaluate healthcare language models on over-refusal and safe completion, revealing challenges in balancing safety and helpfulness across different model sizes and types.

Contribution

It presents the first large-scale benchmark for measuring over-refusal and safe completion in healthcare LLMs, built with an automated pipeline and human validation, and evaluates 30 models, revealing calibration challenges.

Findings

01. Larger models tend to be more safety-pessimistic and over-refuse benign prompts.

02. Safety-optimized models often refuse up to 80% of hard benign prompts.

03. Model size and family significantly influence the balance between safety and utility.

Abstract

Safety alignment in Large Language Models is critical for healthcare; however, reliance on binary refusal boundaries often results in over-refusal of benign queries or unsafe compliance with harmful ones. While existing benchmarks measure these extremes, they fail to evaluate Safe Completion: the model's ability to maximise helpfulness on dual-use or borderline queries by providing safe, high-level guidance without crossing into actionable harm. We introduce Health-ORSC-Bench, the first large-scale benchmark designed to systematically measure Over-Refusal and Safe Completion quality in healthcare. Comprising 31,920 benign boundary prompts across seven health categories (e.g., self-harm, medical misinformation), our framework uses an automated pipeline with human validation to test models at varying levels of intent ambiguity. We evaluate 30…
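As a rough illustration of the kind of quantity the abstract refers to, the sketch below computes the share of refusals and safe completions among a model's responses to benign boundary prompts. The label names, the `response_rates` function, and the toy data are all hypothetical; the paper's actual annotation scheme and metric definitions are not given in this summary and may differ.

```python
from collections import Counter

# Hypothetical response labels; not taken from the paper.
REFUSAL = "refusal"
SAFE_COMPLETION = "safe_completion"


def response_rates(labels):
    """Return the fraction of responses carrying each label.

    `labels` holds one label per response to a benign boundary prompt.
    """
    if not labels:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    total = len(labels)
    return {name: count / total for name, count in counts.items()}


# Toy data: five responses to benign prompts, two of them refusals.
labels = [REFUSAL, SAFE_COMPLETION, SAFE_COMPLETION, REFUSAL, SAFE_COMPLETION]
rates = response_rates(labels)
print(f"over-refusal rate: {rates[REFUSAL]:.0%}")
```

Under this simplified reading, a finding such as "refuse up to 80% of hard benign prompts" would correspond to a refusal share of 0.8 on the hardest subset of prompts.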

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI