Find the Intention of Instruction: Comprehensive Evaluation of Instruction Understanding for Large Language Models
Hyeonseok Moon, Jaehyung Seo, Seungyoon Lee, Chanjun Park, Heuiseok Lim

TL;DR
This paper introduces the IoInst benchmark to evaluate large language models' ability to understand instructions without distraction, revealing that even recent models still struggle with instruction comprehension.
Contribution
The paper presents the IoInst benchmark, a new evaluation tool specifically designed to assess instruction understanding in LLMs beyond simple instruction-following.
Findings
State-of-the-art models still lack robust instruction understanding.
IoInst effectively identifies models' ability to focus on relevant instructions.
The paper analyzes strategies for improving instruction comprehension.
Abstract
One of the key strengths of Large Language Models (LLMs) is their ability to interact with humans by generating appropriate responses to given instructions. This ability, known as instruction-following capability, has established a foundation for the use of LLMs across various fields and serves as a crucial metric for evaluating their performance. While numerous evaluation benchmarks have been developed, most focus solely on clear and coherent instructions. However, we have noted that LLMs can become easily distracted by instruction-formatted statements, which may lead to an oversight of their instruction comprehension skills. To address this issue, we introduce the Intention of Instruction (IoInst) benchmark. This benchmark evaluates LLMs' capacity to remain focused and understand instructions without being misled by extraneous instructions. The primary objective of this benchmark is…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Intelligent Tutoring Systems and Adaptive Learning
Methods: Focus
