Social Meaning in Large Language Models: Structure, Magnitude, and Pragmatic Prompting
Roland Mühlenbernd

TL;DR
This paper investigates whether large language models approximate human social reasoning quantitatively as well as qualitatively, and whether prompting strategies grounded in pragmatic theory improve that alignment. It separates structural fidelity from magnitude calibration.
Contribution
It introduces two metrics for assessing the fidelity of social inferences in LLMs, the Effect Size Ratio (ESR) and the Calibration Deviation Score (CDS), and demonstrates that pragmatic prompting can improve the calibration of their social reasoning.
Findings
LLMs reliably reproduce the qualitative structure of human social inferences.
Models differ significantly in magnitude calibration accuracy.
Combining speaker-knowledge and alternative-awareness prompts improves calibration metrics.
Abstract
Large language models (LLMs) increasingly exhibit human-like patterns of pragmatic and social reasoning. This paper addresses two related questions: do LLMs approximate human social meaning not only qualitatively but also quantitatively, and can prompting strategies informed by pragmatic theory improve this approximation? To address the first, we introduce two calibration-focused metrics distinguishing structural fidelity from magnitude calibration: the Effect Size Ratio (ESR) and the Calibration Deviation Score (CDS). To address the second, we derive prompting conditions from two pragmatic assumptions: that social meaning arises from reasoning over linguistic alternatives, and that listeners infer speaker knowledge states and communicative motives. Applied to a case study on numerical (im)precision across three frontier LLMs, we find that all models reliably reproduce the qualitative…
