# Promises and pitfalls of using LLMs to identify actor stances in political discourse

**Authors:** Viviane Walker, Mario Angst, Thomas Sanchez

PMC · DOI: 10.1371/journal.pone.0335547 · 2025-11-19

## TL;DR

This paper explores how large language models can detect stances in political discourse, highlighting both their potential and limitations.

## Contribution

The paper tests a variety of general prompt chains for generalized (non-domain-specific, non-statement-specific) zero-shot stance detection with LLMs, and compares four publicly available models on 1710 annotated German newspaper paragraphs.

## Key findings

- LLMs can achieve adequate performance in stance detection when using appropriate prompt chains.
- Results depend heavily on the prompt chain method and the LLM, and they vary by the normative statement being analyzed.
- Domain-specific evaluation data is crucial for assessing LLMs in stance detection tasks.
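
The point about results varying by statement can be made concrete by scoring predictions separately for each normative statement rather than pooling them. The following is a minimal sketch, assuming macro-averaged F1 as the metric and illustrative labels (`support`, `opposition`); this summary does not state the paper's exact metric or label set, and the function names are hypothetical.

```python
from collections import defaultdict


def macro_f1(gold, pred):
    """Macro-averaged F1 over all labels seen in gold or pred."""
    labels = set(gold) | set(pred)
    scores = []
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def macro_f1_by_statement(rows):
    """rows: iterable of (statement_id, gold_label, predicted_label)."""
    grouped = defaultdict(lambda: ([], []))
    for statement, gold, pred in rows:
        grouped[statement][0].append(gold)
        grouped[statement][1].append(pred)
    return {s: macro_f1(g, p) for s, (g, p) in grouped.items()}


# Toy data (not from the paper): two statements, two paragraphs each.
rows = [
    ("s1", "support", "support"),
    ("s1", "opposition", "opposition"),
    ("s2", "support", "opposition"),
    ("s2", "opposition", "opposition"),
]
scores = macro_f1_by_statement(rows)
```

Reporting one score per statement, instead of a single pooled number, makes it visible when a model performs well on some statements and poorly on others.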

## Abstract

Empirical research in the social sciences is often interested in understanding actor stances: the positions that social actors take regarding normative statements in societal discourse. In automated text analysis applications, the classification task of stance detection remains challenging. Stance detection is especially difficult due to semantic challenges such as implicitness or missing context, but also due to the general nature of the task. In this paper, we explore the potential of Large Language Models (LLMs) to enable stance detection in a generalized (non-domain, non-statement specific) form. Specifically, we test a variety of different general prompt chains for zero-shot stance classifications. Our evaluation data consists of textual data from a real-world empirical research project in the domain of sustainable urban transport. For 1710 German newspaper paragraphs, each containing an organizational entity, we annotated the stance of the entity toward one of five normative statements. A comparison of four publicly available LLMs shows that they can achieve adequate performance. However, results depend heavily on the prompt chain method and the LLM, and vary by statement. Our findings have implications for computational linguistics methodology and political discourse analysis, as they offer a deeper understanding of the strengths and weaknesses of LLMs in performing the complex semantic task of stance detection. We strongly emphasise the necessity of domain-specific evaluation data for evaluating LLMs, considering trade-offs between model complexity and performance, as well as honestly weighing drawbacks of LLM application against traditional, valid approaches, such as manually annotating representative text samples.
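
To illustrate what a general zero-shot prompt chain for this task might look like, here is a minimal two-step sketch: the first call summarizes what the entity says or does, and the second call classifies the stance from that summary. The prompts, the label set, the fallback label, and the `llm` callable are all assumptions made for illustration; they are not the prompt chains evaluated in the paper.

```python
from typing import Callable

# Hypothetical label set; the paper's annotation scheme may differ.
LABELS = ("support", "opposition", "irrelevant")


def classify_stance(
    llm: Callable[[str], str],  # any function mapping a prompt to a completion
    paragraph: str,
    entity: str,
    statement: str,
    fallback: str = "irrelevant",
) -> str:
    # Step 1: condense what the entity says or does in the paragraph.
    summary = llm(
        f"Text: {paragraph}\n"
        f"Summarize in one sentence what '{entity}' says or does in this text."
    )
    # Step 2: classify the stance using the summary from step 1.
    answer = llm(
        f"Summary: {summary}\n"
        f"Statement: {statement}\n"
        f"What is the stance of '{entity}' toward the statement? "
        f"Answer with one word: {', '.join(LABELS)}."
    )
    # Normalize the completion and accept it only if it is a known label.
    token = answer.strip().lower().strip(".!\"' ")
    return token if token in LABELS else fallback
```

Because `llm` is just a callable, the same chain can be run against different models or replaced by a stub when testing the parsing logic. Any completion that is not exactly one of the labels falls back to `fallback`, which is a deliberate simplification.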

## Figures

36 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12629487/full.md

---
Source: https://tomesphere.com/paper/PMC12629487