GermanPartiesQA: Benchmarking Commercial Large Language Models and AI Companions for Political Alignment and Sycophancy

Jan Batzner; Volker Stocker; Stefan Schmid; Gjergji Kasneci

arXiv:2407.18008·cs.CY·December 2, 2025


TL;DR

This paper introduces GermanPartiesQA, a benchmark for evaluating commercial large language models' political alignment and bias, revealing their factual limitations, ideological tendencies, and steerability in role-playing scenarios.

Contribution

It presents a new benchmark dataset and evaluation methodology for assessing political alignment and biases in commercial LLMs used in decision support tools.

Findings

01. LLMs have limited accuracy in representing factual party positions.

02. Models exhibit consistent ideological alignment patterns.

03. Models' responses reflect persona-based steerability, not true sycophancy.

Abstract

Large language models (LLMs) are increasingly shaping citizens' information ecosystems. Products incorporating LLMs, such as chatbots and AI Companions, are now widely used for decision support and information retrieval, including in sensitive domains, raising concerns about hidden biases and growing potential to shape individual decisions and public opinion. This paper introduces GermanPartiesQA, a benchmark of 418 political statements from German Voting Advice Applications across 11 elections to evaluate six commercial LLMs. We evaluate their political alignment based on role-playing experiments with political personas. Our evaluation reveals three specific findings: (1) Factual limitations: LLMs show limited ability to accurately generate factual party positions, particularly for centrist parties. (2) Model-specific ideological alignment: We identify consistent alignment patterns and…
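The excerpt above does not state how alignment is scored, so the following is only an illustrative sketch, not the authors' method. It assumes each statement is answered with agree, neutral, or disagree (coded here as 1, 0, -1) and measures alignment as the share of statements on which a model's answer matches a party's recorded position. The function names, the coding scheme, and the toy data are all hypothetical.

```python
from typing import Dict, List

# Assumed coding (not from the paper): 1 = agree, 0 = neutral, -1 = disagree.
Position = int


def agreement_rate(model_answers: List[Position],
                   party_positions: List[Position]) -> float:
    """Share of statements where the model's answer equals the party's position."""
    if len(model_answers) != len(party_positions):
        raise ValueError("answer lists must cover the same statements")
    if not model_answers:
        raise ValueError("at least one statement is required")
    matches = sum(m == p for m, p in zip(model_answers, party_positions))
    return matches / len(model_answers)


def alignment_by_party(model_answers: List[Position],
                       parties: Dict[str, List[Position]]) -> Dict[str, float]:
    """Agreement rate of one model against every party in the mapping."""
    return {name: agreement_rate(model_answers, positions)
            for name, positions in parties.items()}


# Toy data for four statements; party names and positions are made up.
model_answers = [1, -1, 0, 1]
parties = {
    "Party A": [1, -1, 1, 1],
    "Party B": [-1, 1, 0, -1],
}

scores = alignment_by_party(model_answers, parties)
print(scores)
```

Under the same assumptions, a persona experiment could be compared against a no-persona baseline by computing these scores for both sets of answers and inspecting the difference per party.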

