Small Edits, Big Consequences: Telling Good from Bad Robustness in Large Language Models
Altynbek Ismailov, Salia Asanova

TL;DR
This paper investigates how robust large language models are to minimal prompt changes. It finds that they often stay correct when most of the prompt is deleted yet frequently ignore small edits that change the task's meaning, which points to a need for better evaluation protocols.
Contribution
The paper introduces a systematic method for separating harmless noise from meaningful semantic changes in LLM prompts, and shows that current models are over-robust to underspecification and insensitive to meaning-changing edits.
Findings
Models remain correct in 85% of cases despite 90% prompt removal.
Models react to only 54% of critical quantifier flips.
Jargon inflation causes 56% correct responses, indicating mixed sensitivity.
Abstract
Large language models (LLMs) now write code in settings where misreading a single word can break safety or cost money, yet we still expect them to overlook stray typos. To probe where useful robustness ends and harmful insensitivity begins, we compile 50 LeetCode problems and craft three minimal prompt perturbations that should vary in importance: (i) progressive underspecification deleting 10% of words per step; (ii) lexical flip swapping a pivotal quantifier ("max" to "min"); and (iii) jargon inflation replacing a common noun with an obscure technical synonym. Six frontier models, including three "reasoning-tuned" versions, solve each mutated prompt, and their Python outputs are checked against the original test suites to reveal whether they reused the baseline solution or adapted. Among 11,853 generations we observe a sharp double asymmetry. Models remain correct in 85% of cases…
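As a rough illustration of the three perturbations named in the abstract, the sketch below generates mutated prompts in Python. It is not the authors' code. The function names, the random choice of which words to delete, the cumulative deletion schedule, and the fixed seed are all assumptions made for the example; the abstract only states that 10% of words are deleted per step, that a pivotal quantifier is swapped, and that a common noun is replaced with an obscure synonym.

```python
import random
import re


def progressive_underspecification(prompt, step=0.10, seed=0):
    """Yield prompts with a growing share of words deleted.

    Assumption: words are dropped uniformly at random, and step k removes
    k * step of the ORIGINAL word count (10%, 20%, ..., 90% by default).
    """
    words = prompt.split()
    order = list(range(len(words)))
    random.Random(seed).shuffle(order)  # fixed order, so each step extends the last
    n_steps = round(1 / step)
    for k in range(1, n_steps):
        n_drop = round(len(words) * step * k)
        dropped = set(order[:n_drop])
        yield " ".join(w for i, w in enumerate(words) if i not in dropped)


def swap_word(prompt, old, new):
    """Replace the first whole-word occurrence of `old` with `new`."""
    return re.sub(rf"\b{re.escape(old)}\b", new, prompt, count=1)


def lexical_flip(prompt, old="max", new="min"):
    """Flip a pivotal quantifier, e.g. "max" to "min"."""
    return swap_word(prompt, old, new)


def jargon_inflation(prompt, common, obscure):
    """Swap a common noun for an obscure synonym (the pair is caller-supplied)."""
    return swap_word(prompt, common, obscure)
```

For example, `lexical_flip("Return the max sum of any subarray")` swaps only the standalone word "max", and `list(progressive_underspecification(prompt))` gives nine variants running from 10% to 90% of the words removed. Each variant would then be sent to a model and its output run against the original test suite.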
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
