Analyzing Political Bias in LLMs via Target-Oriented Sentiment Classification
Akram Elbouanani, Evan Dufraisse, Adrian Popescu

TL;DR
This paper introduces a novel entropy-based metric to analyze political bias in large language models by examining target-oriented sentiment prediction inconsistencies across multiple languages and model sizes.
Contribution
It proposes a bias analysis method that measures how target-oriented sentiment predictions vary when diverse politician names are inserted into the same sentence, revealing biases and language effects in LLMs.
Findings
Bias exists across all tested models and languages.
Larger models show stronger and more consistent biases.
Western languages exhibit higher bias intensity.
Abstract
Political biases encoded by LLMs might have detrimental effects on downstream applications. Existing bias analysis methods rely on small-size intermediate tasks (questionnaire answering or political content generation) and rely on the LLMs themselves for analysis, thus propagating bias. We propose a new approach leveraging the observation that LLM sentiment predictions vary with the target entity in the same sentence. We define an entropy-based inconsistency metric to encode this prediction variability. We insert 1319 demographically and politically diverse politician names in 450 political sentences and predict target-oriented sentiment using seven models in six widely spoken languages. We observe inconsistencies in all tested combinations and aggregate them in a statistically robust analysis at different granularity levels. We observe positive and negative bias toward left and…
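The abstract describes an entropy-based inconsistency metric but this excerpt does not give its formula. As a minimal sketch, assuming the metric is a normalized Shannon entropy over the sentiment labels predicted for one sentence as the inserted name changes, the idea could look like the following. The function name `prediction_entropy`, the three-class label set, and the example labels are hypothetical illustrations, not the authors' implementation.

```python
import math
from collections import Counter


def prediction_entropy(labels, num_classes=3):
    """Normalized Shannon entropy of the labels predicted for ONE sentence
    across all inserted names (assumed definition, not taken from the paper).

    Returns 0.0 when every name gets the same label and 1.0 when the labels
    are spread evenly over `num_classes` classes.
    """
    if not labels:
        raise ValueError("labels must not be empty")
    if num_classes < 2:
        raise ValueError("num_classes must be at least 2")

    total = len(labels)
    counts = Counter(labels)

    # H = sum_k p_k * log(1 / p_k), computed over labels that actually occur.
    entropy = sum(
        (count / total) * math.log(total / count) for count in counts.values()
    )

    # Divide by the maximum possible entropy so scores are comparable
    # across label sets of different sizes.
    return entropy / math.log(num_classes)


# Hypothetical predictions for one sentence with four different names inserted.
consistent = ["positive", "positive", "positive", "positive"]
mixed = ["positive", "negative", "neutral", "positive"]

consistent_score = prediction_entropy(consistent)
mixed_score = prediction_entropy(mixed)
```

Under this assumed definition, a sentence whose predicted sentiment never changes with the name scores 0, and higher scores indicate that the prediction depends more strongly on which politician is named.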
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Sentiment Analysis and Opinion Mining
