Exploring Robustness of Multilingual LLMs on Real-World Noisy Data

Amirhossein Aliakbarzadeh; Lucie Flek; Akbar Karimi

arXiv:2501.08322·cs.CL·January 15, 2025

TL;DR

This study evaluates how well multilingual large language models handle real-world spelling errors across six languages and three NLP tasks. Averaged over all datasets and languages, the gap between clean and noisy test performance ranges from 2.3 to 4.3 absolute percentage points across the studied models, and mT5 models are generally more robust than the others.

Contribution

The paper provides a comprehensive analysis of the robustness of 9 multilingual LLMs to real-world spelling noise across three NLP tasks and six languages, highlighting the superior robustness of mT5 models.

Findings

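01. mT5 (13B) is the most robust model overall.

02. Averaged across all datasets and languages, the performance gap between clean and noisy test data ranges from 2.3 to 4.3 absolute percentage points.

03. Robustness varies across models and languages; mT5 models are generally more robust than BLOOM, Falcon, and BERT-like models.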

Abstract

Large Language Models (LLMs) are trained on Web data that might contain spelling errors made by humans. But do they become robust to similar real-world noise? In this paper, we investigate the effect of real-world spelling mistakes on the performance of 9 language models, with parameters ranging from 0.2B to 13B, in 3 different NLP tasks, namely Natural Language Inference (NLI), Named Entity Recognition (NER), and Intent Classification (IC). We perform our experiments on 6 different languages and build a dictionary of real-world noise for them using the Wikipedia edit history. We show that the performance gap of the studied models on the clean and noisy test data averaged across all the datasets and languages ranges from 2.3 to 4.3 absolute percentage points. In addition, mT5 models, in general, show more robustness compared to BLOOM, Falcon, and BERT-like models. In particular, mT5…
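
The evaluation idea in the abstract (replace words in a clean test set with real-world misspellings drawn from a dictionary, then compare scores) can be illustrated with a minimal sketch. This is not the authors' implementation: the dictionary entries, the function names `inject_noise` and `performance_gap`, and the `rate` parameter are all hypothetical, and how the paper actually extracts misspellings from the Wikipedia edit history is not shown here.

```python
import random

# Hypothetical noise dictionary: correct word -> misspellings observed for it.
# In the paper such a dictionary is built from Wikipedia edit history; these
# entries are made up purely for illustration.
NOISE_DICT = {
    "language": ["langauge", "languge"],
    "robust": ["robsut"],
}


def inject_noise(text, noise_dict, rate=1.0, rng=None):
    """Replace whitespace-separated tokens with a known misspelling.

    Each token found in noise_dict (case-insensitive lookup) is replaced
    with probability `rate`. The replacement is taken from the dictionary
    as-is, so the original capitalization of the token is not preserved.
    """
    rng = rng or random.Random(0)  # fixed seed so runs are repeatable
    noisy_tokens = []
    for token in text.split():
        variants = noise_dict.get(token.lower())
        if variants and rng.random() < rate:
            noisy_tokens.append(rng.choice(variants))
        else:
            noisy_tokens.append(token)
    return " ".join(noisy_tokens)


def performance_gap(clean_score, noisy_score):
    """Absolute gap, in percentage points, between clean and noisy scores."""
    return clean_score - noisy_score
```

A model would then be scored once on the clean test set and once on the output of `inject_noise`, and `performance_gap` applied to the two scores (both expressed as percentages) gives the kind of absolute percentage-point difference the abstract reports.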

Code & Models

Repositories

caisa-lab/llms-real-world-noise-robustness
PyTorch · Official

Videos

Exploring Robustness of Multilingual LLMs on Real-World Noisy Data · Underline

Taxonomy

Topics: Natural Language Processing Techniques · Data Mining Algorithms and Applications · Rough Sets and Fuzzy Logic

Methods: Attention Is All You Need · Byte Pair Encoding · Gated Linear Unit · Residual Connection · Dropout · SentencePiece · Softmax · Linear Layer · Inverse Square Root Schedule