Robustness of Large Language Models to Perturbations in Text
Ayush Singh, Navpreet Singh, Shubham Vatsal

TL;DR
This paper investigates the robustness of large language models to noisy and perturbed text, finding they are surprisingly resilient compared to traditional models, and demonstrates their state-of-the-art performance on real-world benchmarks.
Contribution
The study systematically evaluates LLMs' robustness to text perturbations and introduces new benchmarks and datasets for future research in noisy text scenarios.
Findings
LLMs are robust to morphological noise in text.
LLMs outperform traditional models like BERT and RoBERTa on noisy data.
LLMs achieve state-of-the-art results on Grammar Error Correction and Lexical Semantic Change tasks.
Abstract
A clean dataset has been the foundational assumption of most natural language processing (NLP) systems. However, properly written text is rarely found in real-world scenarios, which often invalidates that assumption. Recently, large language models (LLMs) have shown impressive performance, but can they handle the inevitable noise in real-world data? This work tackles this critical question by investigating LLMs' resilience against morphological variations in text. To that end, we artificially introduce varying levels of noise into a diverse set of datasets and systematically evaluate LLMs' robustness against the corrupted variations of the original text. Our findings show that, contrary to popular belief, generative LLMs are quite robust to noisy perturbations in text. This is a departure from pre-trained models like BERT or RoBERTa whose…
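The noise-injection step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the perturbation functions the authors used, so everything below is an assumption made for illustration: the function name `perturb`, the `noise_level` and `seed` parameters, and the four character-level edits (swap, delete, insert, substitute) are all hypothetical choices.

```python
import random

# Hypothetical sketch: NOT the paper's actual noise model.
# Each word is corrupted with probability `noise_level` by one
# random character-level edit.
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def perturb(text: str, noise_level: float, seed: int = 0) -> str:
    """Return `text` with about `noise_level` of its words corrupted."""
    rng = random.Random(seed)  # seeded so corrupted datasets are reproducible
    noisy_words = []
    for word in text.split():
        # Leave single-character words alone, and skip words not selected.
        if len(word) < 2 or rng.random() >= noise_level:
            noisy_words.append(word)
            continue
        i = rng.randrange(len(word) - 1)  # edit position, always has a right neighbour
        op = rng.choice(["swap", "delete", "insert", "substitute"])
        if op == "swap":
            # Transpose two adjacent characters.
            word = word[:i] + word[i + 1] + word[i] + word[i + 2:]
        elif op == "delete":
            word = word[:i] + word[i + 1:]
        elif op == "insert":
            word = word[:i] + rng.choice(ALPHABET) + word[i:]
        else:
            # Substitute; may occasionally pick the same letter again.
            word = word[:i] + rng.choice(ALPHABET) + word[i + 1:]
        noisy_words.append(word)
    return " ".join(noisy_words)


if __name__ == "__main__":
    sentence = "properly written text is rarely found in real-world scenarios"
    # Build several corrupted copies at increasing noise levels.
    for level in (0.0, 0.25, 0.5, 1.0):
        print(level, perturb(sentence, level, seed=42))
```

Evaluating a model on each corrupted copy and comparing its scores with the clean-text baseline would give a rough robustness curve under these assumed edits.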
Taxonomy
Topics: Natural Language Processing Techniques
Methods: Attention Is All You Need · Sparse Evolutionary Training · Residual Connection · Layer Normalization · Linear Layer · RoBERTa · Attention Dropout · Linear Warmup With Linear Decay · Adam
