LLMs Lost in Translation: M-ALERT uncovers Cross-Linguistic Safety Inconsistencies

Felix Friedrich; Simone Tedeschi; Patrick Schramowski; Manuel Brack; Roberto Navigli; Huu Nguyen; Bo Li; Kristian Kersting

arXiv:2412.15035 · cs.CL · June 24, 2025

TL;DR

This paper introduces M-ALERT, a multilingual safety benchmark for LLMs, revealing significant cross-linguistic safety inconsistencies and emphasizing the need for language-specific safety evaluations.

Contribution

The paper presents M-ALERT, a comprehensive multilingual safety benchmark with 75k prompts across five languages, and highlights safety inconsistencies in 39 state-of-the-art LLMs.

Findings

01 Models often show safety inconsistencies across languages.

02 Certain categories, such as substance_cannabis, consistently trigger unsafe responses across models and languages.

03 Language-specific safety issues are prevalent in current LLMs.

Abstract

Building safe Large Language Models (LLMs) across multiple languages is essential in ensuring both safe access and linguistic diversity. To this end, we conduct a large-scale, comprehensive safety evaluation of the current LLM landscape. For this purpose, we introduce M-ALERT, a multilingual benchmark that evaluates the safety of LLMs in five languages: English, French, German, Italian, and Spanish. M-ALERT includes 15k high-quality prompts per language, totaling 75k, with category-wise annotations. Our extensive experiments on 39 state-of-the-art LLMs highlight the importance of language-specific safety analysis, revealing that models often exhibit significant inconsistencies in safety across languages and categories. For instance, Llama3.2 shows high unsafety in category crime_tax for Italian but remains safe in other languages. Similar inconsistencies can be observed across all…
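The kind of per-language, per-category comparison the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the record layout `(language, category, is_safe)`, the function names `safety_scores` and `inconsistent_categories`, and the `gap` threshold of 0.2 are all assumptions made for this example, and the toy records are invented purely to exercise the functions.

```python
from collections import defaultdict


def safety_scores(records):
    """Return {category: {language: fraction of responses judged safe}}.

    `records` is an iterable of (language, category, is_safe) tuples.
    This layout is a hypothetical one chosen for the sketch.
    """
    counts = defaultdict(lambda: [0, 0])  # (category, language) -> [safe, total]
    for language, category, is_safe in records:
        entry = counts[(category, language)]
        entry[0] += int(is_safe)
        entry[1] += 1

    scores = defaultdict(dict)
    for (category, language), (safe, total) in counts.items():
        scores[category][language] = safe / total
    return dict(scores)


def inconsistent_categories(scores, gap=0.2):
    """Flag categories whose safe fraction differs across languages by more than `gap`.

    The threshold is an arbitrary assumption, not a value from the paper.
    """
    flagged = {}
    for category, by_language in scores.items():
        spread = max(by_language.values()) - min(by_language.values())
        if spread > gap:
            flagged[category] = spread
    return flagged


# Invented toy records, only to show the shape of the computation.
toy_records = [
    ("en", "crime_tax", True),
    ("en", "crime_tax", True),
    ("it", "crime_tax", False),
    ("it", "crime_tax", True),
    ("en", "substance_cannabis", False),
    ("it", "substance_cannabis", False),
]

scores = safety_scores(toy_records)
flagged = inconsistent_categories(scores)
```

In this sketch a category that is unsafe in every language has zero spread and is not flagged, so consistently unsafe categories would need a separate check on the absolute safe fraction.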

Code & Models

Datasets

felfri/M-ALERT
dataset · 46 dl

Taxonomy

Topics: Natural Language Processing Techniques · Library Science and Information Systems