Evaluating and Mitigating Linguistic Discrimination in Large Language Models

Guoliang Dong; Haoyu Wang; Jun Sun; Xinyu Wang

arXiv:2404.18534 · cs.CL · May 13, 2024 · 1 citation

TL;DR

This paper investigates linguistic discrimination in large language models, revealing disparities in safety and quality across languages, and proposes LDFighter, a method to improve consistency and mitigate biases.

Contribution

The study provides a comprehensive analysis of language-based disparities in LLM outputs and introduces LDFighter, a novel similarity-based voting method to enhance fairness and response quality.
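As a rough illustration of what "similarity-based voting" could look like, the sketch below picks, from a set of candidate responses, the one that agrees most with the others on average. This page does not describe LDFighter's actual procedure, so everything here is an assumption made for illustration: the function names `jaccard` and `vote_by_similarity` are hypothetical, token-level Jaccard overlap stands in for whatever similarity measure the paper uses, and the candidates are assumed to be already expressed in one common language.

```python
def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard overlap (an assumed stand-in similarity measure)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty strings are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def vote_by_similarity(responses: list[str]) -> str:
    """Return the response with the highest mean similarity to all others.

    Hypothetical helper: a generic consensus-selection rule, not the
    paper's confirmed algorithm.
    """
    if not responses:
        raise ValueError("need at least one response")
    if len(responses) == 1:
        return responses[0]

    def mean_similarity(i: int) -> float:
        # Average similarity between response i and every other response.
        others = [r for j, r in enumerate(responses) if j != i]
        return sum(jaccard(responses[i], r) for r in others) / len(others)

    best_index = max(range(len(responses)), key=mean_similarity)
    return responses[best_index]


if __name__ == "__main__":
    candidates = [
        "I cannot help with that request",
        "Sorry, I cannot help with that",
        "Sure, here is how to do it",
    ]
    # The outlier that disagrees with the majority should not be selected.
    print(vote_by_similarity(candidates))
```

Under this rule, a single response that diverges from the majority (for example, one jailbroken answer among several refusals) scores low and is not chosen, which matches the intuition behind using voting to improve consistency across languages.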

Findings

01

LLMs are more aligned and safer in English, French, Russian, and Spanish.

02

Languages like Bengali, Georgian, Nepali, and Maithili show higher jailbreak success rates.

03

LDFighter reduces jailbreak success and improves response quality across languages.

Abstract

By training on text in various languages, large language models (LLMs) typically possess multilingual support and demonstrate remarkable capabilities in solving tasks described in different languages. However, LLMs can exhibit linguistic discrimination due to the uneven distribution of training data across languages. That is, LLMs struggle to respond consistently when the same task is described in different languages. In this study, we first explore the consistency of LLM outputs in response to queries in various languages from two aspects: safety and quality. We conduct this analysis on two datasets (AdvBench and NQ) using four LLMs (Llama2-13b, Gemma-7b, GPT-3.5-turbo and Gemini-pro). The results show that LLMs exhibit stronger human alignment capabilities with queries in English, French, Russian, and Spanish (only 1.04% of harmful queries…

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Attention Is All You Need · Linear Layer · Adam · Layer Normalization · Attention Dropout · Multi-Head Attention · Cosine Annealing