Navigating the Cultural Kaleidoscope: A Hitchhiker's Guide to Sensitivity in Large Language Models

Somnath Banerjee; Sayan Layek; Hari Shrawgi; Rajarshi Mandal; Avik Halder; Shanu Kumar; Sagnik Basu; Parag Agrawal; Rima Hazra; Animesh Mukherjee

arXiv:2410.12880 · cs.CL · January 27, 2025

TL;DR

This paper introduces datasets and methods to evaluate and improve cultural sensitivity in large language models, especially smaller ones, to promote ethical and respectful AI across diverse cultural contexts.

Contribution

It presents a cultural harm test dataset and a culturally aligned preference dataset for evaluating and fine-tuning LLMs to reduce cultural insensitivity.

Findings

01. Fine-tuning with culturally aligned feedback improves model sensitivity

02. Datasets effectively identify cultural insensitivities in LLM outputs

03. Enhanced models generate fewer culturally harmful responses

Abstract

As LLMs are increasingly deployed in global applications, the importance of cultural sensitivity becomes paramount, ensuring that users from diverse backgrounds feel respected and understood. Cultural harm can arise when these models fail to align with specific cultural norms, resulting in misrepresentations or violations of cultural values. This work addresses the challenges of ensuring cultural sensitivity in LLMs, especially in small-parameter models that often lack the extensive training data needed to capture global cultural nuances. We present two key contributions: (1) A cultural harm test dataset, created to assess model outputs across different cultural contexts through scenarios that expose potential cultural insensitivities, and (2) A culturally aligned preference dataset, aimed at restoring cultural sensitivity through fine-tuning based on feedback from diverse annotators.…

Code & Models

Repositories

neuralsentinel/culturalkaleidoscope (Official)

Videos

Navigating the Cultural Kaleidoscope: A Hitchhiker’s Guide to Sensitivity in Large Language Models

Taxonomy

Topics: Computational and Text Analysis Methods

Methods: ALIGN