What Prompts Don't Say: Understanding and Managing Underspecification in LLM Prompts
Chenyang Yang, Yike Shi, Qianou Ma, Michael Xieyang Liu, Christian Kästner, Tongshuang Wu

TL;DR
This paper analyzes prompt underspecification in LLMs, shows that models' default handling of unspecified requirements is fragile, and proposes requirements-aware optimization and systematic requirements management to improve reliability.
Contribution
It introduces requirements-aware prompt optimization methods and advocates for proactive discovery and monitoring of requirements to address underspecification issues.
Findings
LLMs infer unspecified requirements by default 41.1% of the time
Underspecified prompts are twice as likely to regress across model or prompt changes, sometimes with accuracy drops exceeding 20%
Requirements-aware optimization improves performance by 4.8% on average over baselines
Abstract
Prompt underspecification is a common challenge when interacting with LLMs. In this paper, we present an in-depth analysis of this problem, showing that while LLMs can often infer unspecified requirements by default (41.1%), such behavior is fragile: underspecified prompts are 2x as likely to regress across model or prompt changes, sometimes with accuracy drops exceeding 20%. This instability makes it difficult to reliably build LLM applications. Moreover, simply specifying all requirements does not consistently help, as models have limited instruction-following ability and requirements can conflict. Standard prompt optimizers likewise provide little benefit. To address these issues, we propose requirements-aware prompt optimization mechanisms that improve performance by 4.8% on average over baselines. We further advocate for a systematic process of proactive requirements discovery,…
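To make the notion of a requirement "regressing" across a model or prompt change concrete, here is a minimal sketch. It is not the authors' method: the function `find_regressions`, the requirement names, the pass rates, and the 0.05 threshold are all hypothetical, chosen only to illustrate comparing per-requirement pass rates before and after a change.

```python
from typing import Dict, List


def find_regressions(
    before: Dict[str, float],
    after: Dict[str, float],
    threshold: float = 0.05,  # assumed tolerance, not a value from the paper
) -> List[str]:
    """Return requirements whose pass rate dropped by more than `threshold`.

    `before` and `after` map a requirement name to the fraction of test
    inputs on which the model's output satisfied that requirement.
    Requirements missing from `after` are skipped rather than flagged.
    """
    return sorted(
        name
        for name, old_rate in before.items()
        if name in after and old_rate - after[name] > threshold
    )


# Hypothetical pass rates for one prompt, measured on two model versions.
baseline = {
    "cites_sources": 0.92,
    "uses_formal_tone": 0.88,  # never stated in the prompt, inferred by default
    "stays_under_200_words": 0.75,
}
candidate = {
    "cites_sources": 0.93,
    "uses_formal_tone": 0.61,
    "stays_under_200_words": 0.74,
}

regressed = find_regressions(baseline, candidate)
print(regressed)
```

In this made-up data, only the requirement that was left unspecified drops sharply, which is the kind of silent regression the abstract describes. Tracking rates per requirement, rather than one aggregate score, is what lets such a drop be noticed.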
