Ask LLMs Directly, "What shapes your bias?": Measuring Social Bias in Large Language Models
Jisu Shin, Hoyun Song, Huije Lee, Soyeong Jeong, and Jong C. Park

TL;DR
This paper introduces a new method to directly measure social bias in large language models by quantifying social perceptions from diverse perspectives, enabling a more detailed understanding of biases.
Contribution
It proposes a novel strategy and metrics for directly assessing social perceptions and biases in LLMs, addressing limitations of previous indirect evaluation methods.
Findings
Metrics effectively capture multi-dimensional social biases
Quantitative analysis demonstrates social attitudes in LLMs
Enables fine-grained investigation of bias in models
Abstract
Social bias is shaped by the accumulation of social perceptions towards targets across various demographic identities. To fully understand such social bias in large language models (LLMs), it is essential to consider the composite of social perceptions from diverse perspectives among identities. Previous studies have evaluated biases in LLMs either by indirectly assessing the presence of sentiments towards demographic identities in the generated text or by measuring the degree of alignment with given stereotypes. These methods have limitations in directly quantifying social biases at the level of distinct perspectives among identities. In this paper, we aim to investigate how social perceptions from various viewpoints contribute to the development of social bias in LLMs. To this end, we propose a novel strategy to intuitively quantify these social perceptions and suggest metrics that can…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Computational and Text Analysis Methods · Authorship Attribution and Profiling
