PolygloToxicityPrompts: Multilingual Evaluation of Neural Toxic Degeneration in Large Language Models
Devansh Jain, Priyanshu Kumar, Samuel Gehman, Xuhui Zhou, Thomas Hartvigsen, Maarten Sap

TL;DR
This paper introduces PolygloToxicityPrompts, a large-scale multilingual benchmark with 425,000 prompts across 17 languages, to evaluate toxicity in large language models and analyze factors influencing toxicity levels.
Contribution
It presents the first large-scale multilingual toxicity benchmark for LLMs, covering 17 languages and built by scraping over 100 million web-text documents, and investigates how model size, prompt language, and tuning methods affect toxicity.
Findings
Toxicity increases as language resources decrease and as model size grows.
Instruction- and preference-tuning reduce toxicity, but preference-tuning method choice has minimal impact.
Multilingual evaluation reveals critical shortcomings in current LLM safety measures.
Abstract
Recent advances in large language models (LLMs) have led to their extensive global deployment, and ensuring their safety calls for comprehensive and multilingual toxicity evaluations. However, existing toxicity benchmarks are overwhelmingly focused on English, posing serious risks to deploying LLMs in other languages. We address this by introducing PolygloToxicityPrompts (PTP), the first large-scale multilingual toxicity evaluation benchmark of 425K naturally occurring prompts spanning 17 languages. We overcome the scarcity of naturally occurring toxicity in web-text and ensure coverage across languages with varying resources by automatically scraping over 100M web-text documents. Using PTP, we investigate research questions to study the impact of model size, prompt language, and instruction and preference-tuning methods on toxicity by benchmarking over 60 LLMs. Notably, we find that…
Taxonomy
Topics: Natural Language Processing Techniques · Biomedical Text Mining and Ontologies
