LinguaSafe: A Comprehensive Multilingual Safety Benchmark for Large Language Models

Zhiyuan Ning; Tianle Gu; Jiaxin Song; Shixin Hong; Lingyu Li; Huacan Liu; Jie Li; Yixu Wang; Meng Lingyu; Yan Teng; Yingchun Wang

arXiv:2508.12733 · cs.CL · August 28, 2025

TL;DR

LinguaSafe introduces a large, diverse multilingual safety benchmark with 45,000 entries across 12 languages, enabling more comprehensive safety evaluations of large language models in various linguistic and cultural contexts.

Contribution

This work provides the first extensive multilingual safety benchmark with detailed evaluation metrics, addressing the lack of diverse safety assessments for under-represented languages.

Findings

01. Safety and helpfulness vary across languages and domains.

02. Multilingual safety assessments reveal significant differences even among resource-similar languages.

03. The benchmark facilitates more balanced safety alignment in LLMs.

Abstract

The widespread adoption and increasing prominence of large language models (LLMs) in global technologies necessitate a rigorous focus on ensuring their safety across a diverse range of linguistic and cultural contexts. The lack of a comprehensive evaluation and diverse data in existing multilingual safety evaluations for LLMs limits their effectiveness, hindering the development of robust multilingual safety alignment. To address this critical gap, we introduce LinguaSafe, a comprehensive multilingual safety benchmark crafted with meticulous attention to linguistic authenticity. The LinguaSafe dataset comprises 45k entries in 12 languages, ranging from Hungarian to Malay. Curated using a combination of translated, transcreated, and natively-sourced data, our dataset addresses the critical need for multilingual safety evaluations of LLMs, filling the void in the safety evaluation of LLMs…
