Is Safety Standard Same for Everyone? User-Specific Safety Evaluation of Large Language Models

Yeonjun In; Wonjoong Kim; Kanghoon Yoon; Sungchul Kim; Mehrab Tanjim; Sangwu Park; Kibum Kim; Chanyoung Park

arXiv:2502.15086·cs.CL·October 24, 2025

TL;DR

This paper introduces U-SafeBench, a benchmark that evaluates whether large language models act safely with respect to user-specific safety standards. It shows that current models fall short of those standards and proposes a chain-of-thought remedy to improve user-specific safety.

Contribution

The paper presents the first benchmark for user-specific safety evaluation of LLMs and demonstrates the models' failure to meet personalized safety standards, along with a simple improvement method.

Findings

01. Current LLMs fail to meet user-specific safety standards.

02. U-SafeBench effectively evaluates user-specific safety.

03. Chain-of-thought improves safety performance.

Abstract

As the use of large language model (LLM) agents continues to grow, their safety vulnerabilities have become increasingly evident. Extensive benchmarks evaluate various aspects of LLM safety, but they define safety largely in terms of general standards and overlook user-specific standards. However, safety standards for LLMs may vary based on user-specific profiles rather than being universally consistent across all users. This raises a critical research question: Do LLM agents act safely when considering user-specific safety standards? Despite its importance for safe LLM use, no benchmark datasets currently exist to evaluate the user-specific safety of LLMs. To address this gap, we introduce U-SafeBench, a benchmark designed to assess the user-specific aspect of LLM safety. Our evaluation of 20 widely used LLMs reveals that current LLMs fail to act safely when considering user-specific safety…

Code & Models

Repositories

yeonjun-in/u-safebench (Official)

Videos

Is Safety Standard Same for Everyone? User-Specific Safety Evaluation of Large Language Models · Underline

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI