Reasoning Boosts Opinion Alignment in LLMs

Frédéric Berdoz; Yann Billeter; Yann Vonlanthen; Roger Wattenhofer

arXiv:2603.01214·cs.CL·March 13, 2026

Open Access · 3 Reviews

TL;DR

This paper investigates whether structured reasoning improves the ability of large language models to produce opinion-aligned responses in political contexts. It finds that reasoning reduces bias but does not eliminate it.

Contribution

The study introduces a reasoning-based training approach for LLMs to improve opinion alignment and provides datasets and benchmarks for future research.

Findings

01. Reasoning improves opinion alignment in LLMs.

02. Models trained with reasoning are competitive with strong baselines.

03. Bias is reduced but not fully eliminated by reasoning.

Abstract

Opinion modeling aims to capture individual or group political preferences, enabling applications such as digital democracies, where models could help shape fairer and more popular policies. Given their versatility, strong generalization capabilities, and demonstrated success across diverse text-to-text applications, large language models (LLMs) are natural candidates for this task. However, due to their statistical nature and limited causal understanding, they tend to produce biased opinions when prompted naively. In this work, we study whether reasoning can improve opinion alignment. Motivated by the recent advancement in mathematical reasoning enabled by reinforcement learning (RL), we train models to produce profile-consistent answers through structured reasoning. We evaluate our approach on three datasets covering U.S., European, and Swiss politics. Results indicate that reasoning…
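To make the idea of rewarding "profile-consistent answers through structured reasoning" concrete, here is a minimal sketch of what a rule-based reward for RL training could look like. It is not the authors' reward: the `<think>`/`<answer>` tag layout, the function name `composite_reward`, and the 0.2/0.8 weights are all assumptions chosen for illustration, since this summary does not specify the paper's format or reward terms.

```python
import re

# Hypothetical output layout: reasoning in <think>, final stance in <answer>.
# The paper's actual structured format is not given in this summary.
PATTERN = re.compile(r"^<think>(.+?)</think>\s*<answer>(.+?)</answer>$", re.DOTALL)


def composite_reward(completion: str, target: str,
                     format_weight: float = 0.2,
                     answer_weight: float = 0.8) -> float:
    """Score one completion against the stance recorded in a survey profile.

    The weights are illustrative assumptions, not values from the paper.
    """
    match = PATTERN.match(completion.strip())
    if match is None:
        # Malformed output earns nothing, even if a stance appears somewhere.
        return 0.0
    reward = format_weight  # well-formed structure earns partial credit
    predicted = match.group(2).strip().lower()
    if predicted == target.strip().lower():
        reward += answer_weight  # stance matches the profile's recorded answer
    return reward
```

Under these assumptions, a well-formed completion with the wrong stance still earns the format term, so the policy gets a learning signal for structure before it learns the profile's preferences.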

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 6 · Confidence 3

Strengths

* **Clear, reproducible method and reward design.** The structured format and composite reward are explicit, with training hyperparameters (LoRA, steps, schedulers) fully reported.
* **Multiple datasets and ideologies.** The paper evaluates on smartvote (binary), Wahl-o-Mat (ternary), and ANES (ternary), with clear unit definitions and splits.
* **Variance reporting across 8 stochastic runs.** Results include mean ± s.d., improving robustness.
* **Insightful analyses of failure modes and class i

Weaknesses

1. **Missing strong baselines for political alignment.** The paper omits comparisons against recent specialized methods that align LLMs to political viewpoints using supervised preference optimization or domain corpora (e.g., Stammbach et al. 2024).
2. **Limited persona-conditioning alternatives.** The method trains *one model per persona*, not comparing to shared-parameter persona embedding methods.
3. **Synthetic SFT rationales risk leakage/priors.** Synthetic arguments generated by a large

Reviewer 02 · Rating 6 · Confidence 3

Strengths

1. First systematic study of using RL-based reasoning for individual-level political opinion modeling, moving beyond demographic-based approaches.
2. Three datasets from different political systems (US, Germany, Switzerland) provide robust cross-cultural validation.
3. The paper identifies a gap between demographic-prompted political simulation (Santurkar et al., 2023; Argyle et al., 2023) and individual-level agents that must stay consistent with a known survey profile.

Weaknesses

1. Training one model per individual is computationally prohibitive for real-world deployment.
2. The method still trains one model per persona, which is expensive and does not scale to population-level simulation; the paper acknowledges this, but the main method remains hard to apply in real digital-democracy settings where thousands of agents are needed.

Reviewer 03 · Rating 4 · Confidence 3

Strengths

1. The paper studies a very relevant question: how reasoning affects opinion alignment, which feels timely and important.
2. The method is clear and builds nicely on recent RL reasoning techniques.
3. The experiments are well-organized and tested across multiple real-world political datasets.
4. The analysis is thoughtful, especially the discussion of ideological bias and neutral-class difficulty.

Weaknesses

1. The clarity of the paper should be further refined, and there are several typos in the content.
2. It’s not entirely clear whether GRPO genuinely improves reasoning or just fits survey patterns better.
3. The paper shows biases but doesn’t really explain their sources or propose fixes.

Taxonomy

Topics: Computational and Text Analysis Methods · Sentiment Analysis and Opinion Mining · Topic Modeling