LLMs can be easily Confused by Instructional Distractions
Yerin Hwang, Yongil Kim, Jahyun Koo, Taegwan Kang, Hyunkyung Bae, and Kyomin Jung

TL;DR
This paper introduces DIM-Bench, a new benchmark to evaluate how large language models struggle with instructional distraction, revealing their vulnerability to confusion when input resembles instructions across various tasks.
Contribution
The paper presents DIM-Bench, a novel benchmark for assessing LLM performance under instructional distraction, highlighting the models' susceptibility to confusion in such scenarios.
Findings
LLMs often fail to follow user intent under instructional distraction
Even advanced LLMs are vulnerable to confusion caused by instructional-like inputs
DIM-Bench categorizes real-world instances of instructional distraction across multiple tasks
Abstract
Despite the fact that large language models (LLMs) show exceptional skill in instruction following tasks, this strength can turn into a vulnerability when the models are required to disregard certain instructions. Instruction-following tasks typically involve a clear task description and input text containing the target data to be processed. However, when the input itself resembles an instruction, confusion may arise, even if there is explicit prompting to distinguish between the task instruction and the input. We refer to this phenomenon as instructional distraction. In this paper, we introduce a novel benchmark, named DIM-Bench, specifically designed to assess LLMs' performance under instructional distraction. The benchmark categorizes real-world instances of instructional distraction and evaluates LLMs across four instruction tasks: rewriting, proofreading, translation, and style…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
Taxonomy
TopicsArtificial Intelligence in Law · Interpreting and Communication in Healthcare
