LIME-LLM: Probing Models with Fluent Counterfactuals, Not Broken Text

George Mihaila; Suleyman Olcay Polat; Poli Nemkova; Himanshu Sharma; Namratha V. Urs; Mark V. Albert

arXiv:2601.11746 · cs.CL · January 21, 2026

Open Access

TL;DR

LIME-LLM introduces a hypothesis-driven perturbation framework for NLP explanations, replacing random noise with fluent, on-manifold neighborhoods to improve local explanation fidelity.

Contribution

The paper proposes an LLM-based controlled perturbation method intended to isolate individual feature effects more cleanly than heuristic token masking or unconstrained generative paraphrasing.

Findings

01. Outperforms baseline explanation methods in fidelity across multiple benchmarks.

02. Produces fluent, semantically valid neighborhood samples for better interpretability.

03. Establishes new state-of-the-art in NLP local explanation fidelity.

Abstract

Local explanation methods such as LIME (Ribeiro et al., 2016) remain fundamental to trustworthy AI, yet their application to NLP is limited by a reliance on random token masking. These heuristic perturbations frequently generate semantically invalid, out-of-distribution inputs that weaken the fidelity of local surrogate models. While recent generative approaches such as LLiMe (Angiulli et al., 2025b) attempt to mitigate this by employing Large Language Models for neighborhood generation, they rely on unconstrained paraphrasing that introduces confounding variables, making it difficult to isolate specific feature contributions. We introduce LIME-LLM, a framework that replaces random noise with hypothesis-driven, controlled perturbations. By enforcing a strict "Single Mask-Single Sample" protocol and employing distinct neutral infill and boundary infill strategies, LIME-LLM constructs…
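To make the contrast in the abstract concrete, the sketch below builds two kinds of neighborhoods for one sentence: a simplified random-masking neighborhood in the spirit of classic LIME, and a neighborhood where each sample changes exactly one token and receives exactly one infilled replacement. The second function is only one plausible reading of "Single Mask-Single Sample"; the abstract is truncated here, so the actual protocol may differ. The names `MASK`, `infill`, and `toy_infill` are hypothetical, and `toy_infill` is a fixed-string stand-in for what would be an LLM call.

```python
import random
from typing import Callable, List, Tuple

MASK = "[MASK]"  # hypothetical placeholder token, not taken from the paper

Neighbor = Tuple[List[int], str]  # (binary keep-vector z, perturbed text)


def random_mask_neighbors(tokens: List[str], n_samples: int,
                          rng: random.Random) -> List[Neighbor]:
    """Simplified LIME-style neighborhood: drop a random subset of tokens.

    The resulting texts are often ungrammatical, which is the
    out-of-distribution problem the abstract describes.
    """
    neighbors = []
    for _ in range(n_samples):
        z = [rng.randint(0, 1) for _ in tokens]  # 1 = keep, 0 = drop
        text = " ".join(t for t, keep in zip(tokens, z) if keep)
        neighbors.append((z, text))
    return neighbors


def single_mask_neighbors(tokens: List[str],
                          infill: Callable[[List[str], int], str]
                          ) -> List[Neighbor]:
    """Assumed reading of 'Single Mask-Single Sample'.

    Each neighbor masks exactly one position, and each mask is filled
    exactly once, so a neighbor differs from the input in one place only.
    """
    neighbors = []
    for i in range(len(tokens)):
        z = [1] * len(tokens)
        z[i] = 0  # only position i is perturbed
        masked = tokens[:i] + [MASK] + tokens[i + 1:]
        replacement = infill(masked, i)  # would be an LLM call in practice
        text = " ".join(tokens[:i] + [replacement] + tokens[i + 1:])
        neighbors.append((z, text))
    return neighbors


def toy_infill(masked_tokens: List[str], position: int) -> str:
    """Stand-in for an LLM infill: always returns the same neutral word."""
    return "something"
```

In a full pipeline, the perturbed texts would be scored by the model under explanation and a weighted linear surrogate would be fit on the `z` vectors, as in standard LIME. That step is omitted here because the abstract does not describe how LIME-LLM weights or fits its neighborhood.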

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Artificial Intelligence in Healthcare and Education