CultureGuard: Towards Culturally-Aware Dataset and Guard Model for Multilingual Safety Applications

Raviraj Joshi; Rakesh Paul; Kanishk Singla; Anusha Kamath; Michael Evans; Katherine Luna; Shaona Ghosh; Utkarsh Vaidya; Eileen Long; Sanjay Singh Chauhan; Niranjan Wartikar

arXiv:2508.01710 · cs.CL · November 11, 2025

TL;DR

CultureGuard introduces a pipeline for creating culturally aligned safety datasets in multiple languages, enabling the training of multilingual safety guard models that achieve state-of-the-art results on safety benchmarks and generalize well across languages.

Contribution

We propose a novel four-stage pipeline for generating high-quality, culturally aligned safety datasets across multiple languages, facilitating multilingual safety guard model development.

Findings

01. The dataset contains 386,661 samples across 9 languages (English plus the eight target languages).

02. The trained model achieves state-of-the-art performance on safety benchmarks.

03. Multilingual fine-tuning improves cross-lingual transfer and zero-shot generalization.

Abstract

The increasing use of Large Language Models (LLMs) in agentic applications highlights the need for robust safety guard models. While content safety in English is well-studied, non-English languages lack similar advancements due to the high cost of collecting culturally aligned labeled datasets. We present CultureGuard, a novel solution for curating culturally aligned, high-quality safety datasets across multiple languages. Our approach introduces a four-stage synthetic data generation and filtering pipeline: cultural data segregation, cultural data adaptation, machine translation, and quality filtering. This pipeline enables the conversion and expansion of the Nemotron-Content-Safety-Dataset-V2 English safety dataset into eight distinct languages: Arabic, German, Spanish, French, Hindi, Japanese, Thai, and Chinese. The resulting dataset, Nemotron-Safety-Guard-Dataset-v3, comprises…
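
The four stages named in the abstract can be sketched as a chain of simple transformations. This is only an illustrative outline, not the authors' implementation: the abstract gives the stage names but not their internals, so the `Sample` type, `run_pipeline`, every callable parameter, and the `threshold` default are hypothetical placeholders. In a real system each callable would presumably wrap an LLM or a machine translation model.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class Sample:
    """Hypothetical record type; the real dataset schema is not given here."""
    text: str
    label: str                        # safety label carried over from the English source
    lang: str = "en"
    culturally_specific: bool = False


def run_pipeline(
    samples: List[Sample],
    target_lang: str,
    is_cultural: Callable[[str], bool],        # assumed classifier for stage 1
    adapt_fn: Callable[[str, str], str],       # assumed rewriter for stage 2
    translate_fn: Callable[[str, str], str],   # assumed translator for stage 3
    score_fn: Callable[[Sample], float],       # assumed quality scorer for stage 4
    threshold: float = 0.5,                    # assumed cut-off, not from the paper
) -> List[Sample]:
    # Stage 1, cultural data segregation: flag samples tied to a specific culture.
    flagged = [replace(s, culturally_specific=is_cultural(s.text)) for s in samples]

    # Stage 2, cultural data adaptation: rewrite only the flagged samples.
    adapted = [
        replace(s, text=adapt_fn(s.text, target_lang)) if s.culturally_specific else s
        for s in flagged
    ]

    # Stage 3, machine translation: translate every sample, keeping its label.
    translated = [
        replace(s, text=translate_fn(s.text, target_lang), lang=target_lang)
        for s in adapted
    ]

    # Stage 4, quality filtering: drop samples scoring below the threshold.
    return [s for s in translated if score_fn(s) >= threshold]


# Toy stand-ins so the sketch runs end to end.
english = [
    Sample("Plans for Thanksgiving dinner", "safe"),
    Sample("How do I reset my password?", "safe"),
    Sample("drop me", "safe"),
]
result = run_pipeline(
    english,
    target_lang="hi",
    is_cultural=lambda text: "Thanksgiving" in text,
    adapt_fn=lambda text, lang: text.replace("Thanksgiving", "Diwali"),
    translate_fn=lambda text, lang: f"[{lang}] {text}",
    score_fn=lambda s: 0.0 if "drop" in s.text else 1.0,
)
```

Passing each stage in as a callable keeps the outline independent of any particular model or translation service, which seems appropriate given that the abstract does not say which ones are used.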

Code & Models

Models

nvidia/Llama-3.1-Nemotron-Safety-Guard-8B-v3
model · 1.6k downloads · 16 likes

Datasets

nvidia/Nemotron-Safety-Guard-Dataset-v3
dataset · 1.6k downloads

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Occupational Health and Safety Research