A Multi-Dimensional Audit of Politically Aligned Large Language Models
Lisa Korver, Mohamed Mostagir, Sherief Reda

TL;DR
This paper introduces a multi-dimensional framework to audit politically aligned large language models across effectiveness, fairness, truthfulness, and persuasiveness, revealing trade-offs and deficiencies in current models.
Contribution
It proposes a quantitative auditing framework, inspired by Habermas' Theory of Communicative Action, that evaluates politically aligned LLMs across four dimensions: effectiveness, fairness, truthfulness, and persuasiveness.
Findings
Larger models are more effective and truthful but less fair and more biased.
Fine-tuned models show lower bias but reduced reasoning performance and more hallucinations.
All models exhibit deficiencies in at least one of the four evaluated metrics.
Abstract
As the application of Large Language Models (LLMs) spreads across various industries, there are increasing concerns about the potential for their misuse, especially in sensitive areas such as political discourse. Deliberately aligning LLMs with specific political ideologies, through prompt engineering or fine-tuning techniques, can be advantageous in use cases such as political campaigns, but requires careful consideration due to heightened risks of performance degradation, misinformation, or increased biased behavior. In this work, we propose a multi-dimensional framework inspired by Habermas' Theory of Communicative Action to audit politically aligned language models across four dimensions: effectiveness, fairness, truthfulness, and persuasiveness using automated, quantitative metrics. Applying this to nine popular LLMs aligned via fine-tuning or role-playing revealed consistent…
