CASE-Bench: Context-Aware SafEty Benchmark for Large Language Models

Guangzhi Sun; Xiao Zhan; Shutong Feng; Philip C. Woodland; Jose Such

arXiv:2501.14940 · cs.CL · February 10, 2025

Open Access

TL;DR

This paper introduces CASE-Bench, a context-aware safety benchmark for large language models, and shows that context has a significant impact on both human safety judgments and model performance.

Contribution

We present CASE-Bench, a novel safety benchmark that incorporates context into evaluations, and demonstrate its importance through extensive analysis of various LLMs.

Findings

01. Context significantly affects safety judgments (p < 0.0001).

02. Commercial models often mismatch human safety judgments in safe contexts.

03. Recruiting enough annotators, rather than relying on a majority vote from a few, makes statistically significant differences between conditions detectable.
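The finding on annotator group size rests on power analysis. As a rough illustration only (not the authors' procedure), the sketch below uses the textbook sample-size formula for a two-sided two-proportion z-test to estimate how many annotators each condition would need. The function name `annotators_per_condition` and the example proportions are hypothetical and are not taken from the paper.

```python
import math
from statistics import NormalDist


def annotators_per_condition(p1, p2, alpha=0.05, power=0.8):
    """Annotators needed per condition to detect p1 vs p2.

    Uses the standard normal-approximation formula for a two-sided
    two-proportion z-test. p1 and p2 are the assumed (hypothetical)
    fractions of annotators who rate a response as safe in each condition.
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    p_bar = (p1 + p2) / 2                          # pooled proportion under H0
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)


# Hypothetical effect sizes: a large gap and a small gap between conditions.
large_gap = annotators_per_condition(0.5, 0.7)
small_gap = annotators_per_condition(0.5, 0.55)
print(large_gap, small_gap)
```

Under these assumptions, a smaller gap between conditions requires far more annotators, which is why a majority vote over a handful of raters can miss real differences.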

Abstract

Aligning large language models (LLMs) with human values is essential for their safe deployment and widespread adoption. Current LLM safety benchmarks often focus solely on the refusal of individual problematic queries, which overlooks the importance of the context in which the query occurs and may cause undesired refusals of queries in safe contexts, diminishing user experience. Addressing this gap, we introduce CASE-Bench, a Context-Aware SafEty Benchmark that integrates context into safety assessments of LLMs. CASE-Bench assigns distinct, formally described contexts to categorized queries based on Contextual Integrity theory. Additionally, in contrast to previous studies, which mainly rely on majority voting from just a few annotators, we recruited enough annotators to ensure the detection of statistically significant differences among the experimental…

Taxonomy

Topics: Topic Modeling · Access Control and Trust

Methods: Focus