TL;DR
PICon is a systematic multi-turn interrogation framework designed to evaluate the consistency of persona agents across internal, external, and retest dimensions, revealing contradictions and evasive responses.
Contribution
PICon introduces an evaluation methodology that applies logically chained multi-turn questioning to assess persona agent consistency, addressing the lack of a systematic verification method.
Findings
Most persona agents fail to meet human-level consistency standards.
Chained questioning exposes contradictions and evasive responses in systems previously deemed consistent.
The framework provides a practical tool for rigorous evaluation of persona agents.
Abstract
Large language model (LLM)-based persona agents are rapidly being adopted as scalable proxies for human participants across diverse domains. Yet there is no systematic method for verifying whether a persona agent's responses remain free of contradictions and factual inaccuracies throughout an interaction. A principle from interrogation methodology offers a lens: no matter how elaborate a fabricated identity, systematic interrogation will expose its contradictions. We apply this principle to propose PICon, an evaluation framework that probes persona agents through logically chained multi-turn questioning. PICon evaluates consistency along three core dimensions: internal consistency (freedom from self-contradiction), external consistency (alignment with real-world facts), and retest consistency (stability under repetition). Evaluating seven groups of persona agents alongside 63 real human…
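As a rough illustration of the retest dimension described in the abstract (stability under repetition), the sketch below scores how often repeated answers to the same question agree. It is a minimal sketch under stated assumptions, not PICon's actual metric: the function names `normalize` and `retest_consistency`, the exact-match comparison, and the sample answers are all hypothetical.

```python
from itertools import combinations


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(answer.lower().split())


def retest_consistency(answers: list[str]) -> float:
    """Fraction of answer pairs that match exactly after normalization.

    Hypothetical scoring rule for illustration only; the source does not
    specify how PICon computes retest consistency.
    """
    normalized = [normalize(a) for a in answers]
    pairs = list(combinations(normalized, 2))
    if not pairs:
        return 1.0  # fewer than two answers: nothing to disagree with
    agreeing = sum(1 for a, b in pairs if a == b)
    return agreeing / len(pairs)


# Hypothetical answers from asking a persona agent the same question three times.
answers = ["Chicago", "chicago ", "Boston"]
score = retest_consistency(answers)
print(f"retest consistency: {score:.2f}")
```

Exact matching is the simplest possible comparison. A realistic evaluator would likely need semantic matching to treat paraphrased answers as equivalent, and the internal and external dimensions would need contradiction detection and fact checking, which this sketch does not attempt.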
