Truth or Tribe: How In-group Favoritism Prioritize Facts in Persona Agents
Shijun Lei, Hongyu Wang, Yunji Liang, Haowen Zheng, Bin Guo, Zhiwen Yu

TL;DR
This paper investigates in-group favoritism in persona agents, revealing its persistence in misinformation contexts and proposing three strategies to mitigate its effects.
Contribution
It introduces a novel simulation framework to study in-group bias in persona agents and evaluates interventions to reduce this bias.
Findings
Persona agents favor in-group members, accepting false information from them more readily than from out-group members.
In-group favoritism persists in ambiguous truth scenarios and increases with cognitive complexity.
Three mitigation strategies effectively reduce in-group bias in persona agents.
Abstract
In-group favoritism refers to the phenomenon of favoring members of one's in-group over out-group members and is widely observed in numerous social cooperative behaviors. Recently, in-group favoritism biases have also been identified in generative language models. However, whether in-group favoritism exists when persona agents are faced with contradicting information (e.g., misinformation), and how to mitigate the adverse effects of in-group favoritism biases in persona agents, have been understudied. To address these problems, we propose a Truth or Tribe simulation framework to study agent cooperation during the spread of contradicting information through a triadic interaction paradigm, and conduct controlled trials to evaluate the primary moderating factors. Extensive results show that persona agents display strong in-group favoritism, accepting incorrect answers from…
