Validating Generative Agent-Based Models of Social Norm Enforcement: From Replication to Novel Predictions
Logan Cross, Nick Haber, Daniel L.K. Yamins

TL;DR
This paper develops a systematic validation framework for generative agent-based models of social norm enforcement, demonstrating their ability to replicate known behaviors and generate novel, testable social science predictions.
Contribution
It introduces a two-stage validation approach for LLM-based social behavior models and shows how validated models can produce new insights into social norm enforcement.
Findings
Both persona-based individual differences and theory of mind capabilities are essential for replicating third-party punishment (TPP) as a costly signal of trustworthiness.
Reputational information via gossip increases cooperation in public goods games.
Anonymous punishment reduces third-party punishment rates but does not eliminate them.
Abstract
As large language models (LLMs) advance, there is growing interest in using them to simulate human social behavior through generative agent-based modeling (GABM). However, validating these models remains a key challenge. We present a systematic two-stage validation approach using social dilemma paradigms from the psychological literature. The first stage identifies the cognitive components necessary for LLM agents to reproduce known human behaviors in mixed-motive settings from two landmark papers; the second uses the validated architecture to simulate novel conditions. Our model comparison of different cognitive architectures shows that both persona-based individual differences and theory of mind capabilities are essential for replicating third-party punishment (TPP) as a costly signal of trustworthiness. For the second study on public goods games, this architecture is able to replicate an increase in…
