Robustness of Large Language Models to Perturbations in Text

Ayush Singh; Navpreet Singh; Shubham Vatsal

arXiv:2407.08989·cs.CL·October 8, 2025·1 cites

Open Access

TL;DR

This paper investigates the robustness of large language models to noisy and perturbed text, finding they are surprisingly resilient compared to traditional models, and demonstrates their state-of-the-art performance on real-world benchmarks.

Contribution

The study systematically evaluates LLMs' robustness to text perturbations and introduces new benchmarks and datasets for future research in noisy text scenarios.

Findings

1. LLMs are robust to morphological noise in text.

2. LLMs outperform traditional models like BERT and RoBERTa on noisy data.

3. LLMs achieve state-of-the-art results on Grammar Error Correction and Lexical Semantic Change tasks.

Abstract

A clean dataset has been the foundational assumption of most natural language processing (NLP) systems. However, properly written text is rarely found in real-world scenarios, which often invalidates that assumption. Recently, large language models (LLMs) have shown impressive performance, but can they handle the inevitable noise in real-world data? This work tackles this critical question by investigating LLMs' resilience against morphological variations in text. To that end, we artificially introduce varying levels of noise into a diverse set of datasets and systematically evaluate LLMs' robustness against the corrupted variations of the original text. Our findings show that, contrary to popular belief, generative LLMs are quite robust to noisy perturbations in text. This is a departure from pre-trained models like BERT or RoBERTa whose…
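The abstract describes injecting varying levels of noise into text but does not spell out the procedure in this excerpt. As a rough illustration only, the sketch below corrupts a chosen fraction of alphabetic characters with random character-level edits. The function name `perturb`, the `noise_level` parameter, and the four edit operations (delete, substitute, insert, swap) are assumptions made for this example, not the authors' method.

```python
import random
import string


def perturb(text: str, noise_level: float, seed: int = 0) -> str:
    """Apply a random character-level edit to roughly `noise_level`
    of the alphabetic characters in `text`.

    Hypothetical helper for illustration; the paper's actual noise
    procedure may differ.
    """
    if not 0.0 <= noise_level <= 1.0:
        raise ValueError("noise_level must be in [0, 1]")

    rng = random.Random(seed)  # seeded so runs are reproducible
    chars = list(text)
    out = []
    i = 0
    while i < len(chars):
        c = chars[i]
        # Only alphabetic characters are candidates for corruption.
        if c.isalpha() and rng.random() < noise_level:
            op = rng.choice(["delete", "substitute", "insert", "swap"])
            if op == "delete":
                pass  # drop the character entirely
            elif op == "substitute":
                out.append(rng.choice(string.ascii_lowercase))
            elif op == "insert":
                out.append(c)
                out.append(rng.choice(string.ascii_lowercase))
            else:  # swap with the following character, if there is one
                if i + 1 < len(chars):
                    out.append(chars[i + 1])
                    out.append(c)
                    i += 1  # the next character has already been emitted
                else:
                    out.append(c)
        else:
            out.append(c)
        i += 1
    return "".join(out)


if __name__ == "__main__":
    sentence = "Properly written text is rarely found in real-world data."
    for level in (0.0, 0.1, 0.3, 0.5):
        print(f"{level:.1f}: {perturb(sentence, level, seed=42)}")
```

Sweeping `noise_level` as in the usage loop gives a family of increasingly corrupted inputs; a model's score on each variant can then be compared against its score on the clean text to gauge how quickly performance degrades.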

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Attention Is All You Need · Sparse Evolutionary Training · Residual Connection · Layer Normalization · Linear Layer · RoBERTa · Attention Dropout · Linear Warmup With Linear Decay · Adam