When Choices Become Risks: Safety Failures of Large Language Models under Multiple-Choice Constraints

Yuheng Chen, Zhiyu Wu, Bowen Cheng, and Tetsuro Takahashi

arXiv:2604.16916 · cs.CL · April 21, 2026

TL;DR

This paper shows that large language models often fail to refuse harmful requests when those requests are posed as multiple-choice tasks, a safety risk that evaluations built around open-ended generation overlook.

Contribution

The paper identifies a systematic failure mode in which forced-choice constraints lead models to violate safety policies, even when the same models refuse equivalent open-ended prompts.

Findings

01. Forced-choice constraints increase policy violations across models.

02. Violation rates peak at intermediate constraint levels for human-authored MCQs.

03. High-capability models show near-saturation violation rates and transferability.

Abstract

Safety alignment in large language models (LLMs) is primarily evaluated under open-ended generation, where models can mitigate risk by refusing to respond. In contrast, many real-world applications place LLMs in structured decision-making tasks, such as multiple-choice questions (MCQs), where abstention is discouraged or unavailable. We identify a systematic failure mode in this setting: reformulating harmful requests as forced-choice MCQs, where all options are unsafe, can systematically bypass refusal behavior, even in models that consistently reject equivalent open-ended prompts. Across 14 proprietary and open-source models, we show that forced-choice constraints sharply increase policy-violating responses. Notably, for human-authored MCQs, violation rates follow an inverted U-shaped trend with respect to structural constraint strength, peaking under intermediate task specifications,…
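
The violation-rate comparison described in the abstract can be illustrated with a minimal sketch. It assumes each model response has already been labeled as policy-violating or not, and that the structural constraint strength is encoded as an ordinal level. The function names, the integer encoding of levels, and the toy records are hypothetical illustrations; they are not the paper's code or data.

```python
from collections import defaultdict

def violation_rates(records):
    """Return the fraction of policy-violating responses per constraint level.

    `records` is an iterable of (constraint_level, violated) pairs, where
    constraint_level is an ordinal int (hypothetical encoding) and violated
    is a bool label assigned by some upstream judge.
    """
    counts = defaultdict(lambda: [0, 0])  # level -> [violations, total]
    for level, violated in records:
        counts[level][0] += int(violated)
        counts[level][1] += 1
    return {level: v / n for level, (v, n) in sorted(counts.items())}

def peak_level(rates):
    """Return the constraint level with the highest violation rate."""
    return max(rates, key=rates.get)

# Hypothetical toy labels, for illustration only (not results from the paper).
toy = [
    (0, False), (0, False),
    (1, True), (1, True), (1, False),
    (2, True), (2, False), (2, False),
]
rates = violation_rates(toy)
peak = peak_level(rates)
```

An inverted U-shaped trend of the kind the abstract reports would appear here as `peak` falling on an intermediate level rather than on the weakest or strongest constraint.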
