LPDS: Evaluating LLM Robustness Through Logic-Preserving Difficulty Scaling
Philipp Mondorf, Samuel J. Bell, Jesse Dodge, Dieuwke Hupkes

TL;DR
This paper introduces LPDS, a framework for systematically evaluating LLM robustness by identifying and testing the most challenging logic-preserving variations, revealing significant performance drops and guiding more effective fine-tuning.
Contribution
LPDS provides a systematic method to quantify and find the most difficult problem variations, improving robustness evaluation and training strategies for LLMs.
Findings
Performance drops are up to 5 times larger under LPDS than under random sampling of variations.
LPDS efficiently finds difficult variations that induce model failures.
Fine-tuning on difficult variations yields more consistent robustness improvements.
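The gap between random sampling and a targeted search for hard variations can be illustrated with a toy sketch. This is not the LPDS procedure itself, which this summary does not specify. The model (`toy_model`), its assumed weakness on large numbers, the difficulty proxy, and all names and values are hypothetical and chosen only for illustration.

```python
import random

def make_variations(names, amounts):
    """Enumerate logic-preserving variations of one problem template:
    only the name and the two numbers change, not the underlying logic."""
    return [(n, a, b) for n in names for a in amounts for b in amounts]

def toy_model(variation):
    """Hypothetical stand-in for an LLM: True means a correct answer.
    Assumed weakness: it fails whenever both numbers are large."""
    _, a, b = variation
    return not (a >= 1000 and b >= 1000)

def accuracy(model, variations):
    """Fraction of variations the model answers correctly."""
    return sum(model(v) for v in variations) / len(variations)

def random_subset_accuracy(model, variations, k, seed=0):
    """Accuracy on k variations drawn uniformly at random."""
    rng = random.Random(seed)
    return accuracy(model, rng.sample(variations, k))

def hardest_subset_accuracy(model, variations, k, difficulty):
    """Accuracy on the k variations ranked hardest by a difficulty score."""
    ranked = sorted(variations, key=difficulty, reverse=True)
    return accuracy(model, ranked[:k])

names = ["Ana", "Bo", "Chen", "Dee"]
amounts = [3, 12, 250, 1000, 4096]
variations = make_variations(names, amounts)

# Assumed difficulty proxy: larger operands are treated as harder.
def difficulty(v):
    return v[1] + v[2]

full_acc = accuracy(toy_model, variations)
rand_acc = random_subset_accuracy(toy_model, variations, k=10)
hard_acc = hardest_subset_accuracy(toy_model, variations, k=10, difficulty=difficulty)

print(f"all variations: {full_acc:.2f}")
print(f"random subset:  {rand_acc:.2f}")
print(f"hardest subset: {hard_acc:.2f}")
```

In this toy setup the model fails on only a small share of all variations, so a random subset mostly misses them, while ranking by the difficulty proxy concentrates the evaluation on exactly the cases that fail.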
Abstract
As large language models (LLMs) are increasingly deployed to perform tasks with minimal human oversight, it is crucial that these models operate robustly. In particular, a model that can solve a given problem should not fail simply because certain entities (such as names, numbers, or other contextual details) have changed while the underlying problem logic remains the same. Prior work suggests that current LLMs still struggle with this form of robustness: they often succeed on some variations of a problem but fail on others. However, existing evaluations often lack a systematic way to identify which logic-preserving variations are most likely to induce failure. Instead, they typically test a random subset of allowable variations, which can overstate robustness. To address this gap, we introduce logic-preserving difficulty scaling (LPDS), a framework that…
