With a Grain of SALT: Are LLMs Fair Across Social Dimensions?
Samee Arif, Zohaib Khan, Maaidah Kaleem, Suhaib Rashid, Agha Ali Raza, Awais Athar

TL;DR
This paper systematically analyzes biases in open-source LLMs across social dimensions using the SALT benchmark, revealing consistent polarization and emphasizing the need for bias mitigation.
Contribution
It introduces SALT, a comprehensive bias benchmark for LLMs, and provides a detailed analysis of biases across multiple social groups and contexts.
Findings
Models show systematic bias favoring or disfavoring certain groups.
Bias varies across different social dimensions and contexts.
Evaluation biases are identified and addressed through human validation.
Abstract
This paper presents a systematic analysis of biases in open-source Large Language Models (LLMs) across gender, religion, and race. Our study evaluates bias in smaller-scale Llama and Gemma models using the SALT (Social Appropriateness in LLM-Generated Text) dataset, which incorporates five distinct bias triggers: General Debate, Positioned Debate, Career Advice, Problem Solving, and CV Generation. To quantify bias, we measure win rates in General Debate and the assignment of negative roles in Positioned Debate. For real-world use cases, such as Career Advice, Problem Solving, and CV Generation, we anonymize the outputs to remove explicit demographic identifiers and use DeepSeek-R1 as an automated evaluator. We also address inherent biases in LLM-based evaluation, including evaluation bias, positional bias, and length bias, and validate our…
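As a rough illustration of the win-rate measurement mentioned for General Debate, the following minimal sketch counts how often each group is declared the winner relative to how often it takes part. The record layout, the `win_rates` function name, and the placeholder group labels are assumptions made for this example; they are not taken from the paper, whose exact scoring procedure is not given in this summary.

```python
from collections import Counter


def win_rates(debates):
    """Return each group's share of wins among the debates it took part in.

    `debates` is an iterable of (participants, winner) pairs, where
    `participants` is a tuple of group labels and `winner` is one of them.
    This record layout is a hypothetical one chosen for the sketch.
    """
    appearances = Counter()
    wins = Counter()
    for participants, winner in debates:
        for group in participants:
            appearances[group] += 1
        wins[winner] += 1
    # Groups that never win still appear in the result with a rate of 0.0.
    return {group: wins[group] / appearances[group] for group in appearances}


# Placeholder labels only; no real results are implied.
example_debates = [
    (("group_x", "group_y"), "group_x"),
    (("group_x", "group_y"), "group_x"),
    (("group_x", "group_y"), "group_y"),
    (("group_y", "group_z"), "group_z"),
]
rates = win_rates(example_debates)
```

Under this reading, an unbiased model would give groups roughly equal win rates over many debates, so large gaps between rates would be the signal of interest.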
Taxonomy
Topics: Cooperative Studies and Economics · Private Equity and Venture Capital · FinTech, Crowdfunding, Digital Finance
Methods: LLaMA · Sparse Evolutionary Training
