Measuring Political Bias in Large Language Models: What Is Said and How It Is Said
Yejin Bang, Delong Chen, Nayeon Lee, Pascale Fung

TL;DR
This paper introduces a framework for measuring political bias in large language models by analyzing both the content and the style of their generations across a range of political issues, with the goal of transparent and scalable measurement.
Contribution
It presents a novel, explainable, and scalable method for assessing political bias in LLMs, covering both content and style, across multiple political topics.
Findings
Bias varies significantly across models and issues
The framework is scalable to new topics and models
It provides transparent and explainable bias measurements
Abstract
We propose to measure political bias in LLMs by analyzing both the content and style of their generated content regarding political issues. Existing benchmarks and measures focus on gender and racial biases. However, political bias exists in LLMs and can lead to polarization and other harms in downstream applications. In order to provide transparency to users, we advocate that there should be fine-grained and explainable measures of political biases generated by LLMs. Our proposed measure looks at different political issues such as reproductive rights and climate change, at both the content (the substance of the generation) and the style (the lexical polarity) of such bias. We measured the political bias in eleven open-sourced LLMs and showed that our proposed framework is easily scalable to other topics and is explainable.
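The two levels named in the abstract, content and style, can be illustrated with a toy sketch. This is not the authors' implementation: the stance labels are assumed to come from some upstream classifier or annotator, and the `POLARITY` lexicon, the function names, and the sample sentences are all hypothetical stand-ins chosen for illustration.

```python
from collections import Counter

# Hypothetical toy lexicon (an assumption for illustration only);
# a real study would use an established sentiment or polarity resource.
POLARITY = {"protect": 1, "freedom": 1, "safe": 1,
            "threat": -1, "ban": -1, "crisis": -1}


def stance_distribution(labels):
    """Content-level view: fraction of generations per stance label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: counts[label] / total for label in sorted(counts)}


def lexical_polarity(text):
    """Style-level view: mean polarity of lexicon words, 0.0 if none occur."""
    tokens = [t.strip(".,;:!?\"'").lower() for t in text.split()]
    scores = [POLARITY[t] for t in tokens if t in POLARITY]
    return sum(scores) / len(scores) if scores else 0.0


def topic_report(generations):
    """Summarize (text, stance_label) pairs for one topic and one model."""
    texts = [text for text, _ in generations]
    labels = [label for _, label in generations]
    return {
        "stance": stance_distribution(labels),
        "polarity": sum(lexical_polarity(t) for t in texts) / len(texts),
    }


# Invented sample generations with assumed stance labels.
sample = [
    ("New rules protect freedom of choice.", "support"),
    ("Critics call the ban a threat.", "oppose"),
    ("The bill passed on Tuesday.", "neutral"),
    ("Advocates say clinics stay safe.", "support"),
]
report = topic_report(sample)
print(report)
```

Reporting the stance distribution separately from the polarity score keeps the two signals distinguishable: a model could take a balanced mix of stances while still using more charged wording for one side, or the reverse.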
Taxonomy
Topics: Hate Speech and Cyberbullying Detection
Methods: Focus
