LLM-based Human Simulations Have Not Yet Been Reliable
Qian Wang, Jiaying Wu, Zichen Jiang, Zhenheng Tang, Bingqiao Luo, Nuo Chen, Wei Chen, Bingsheng He

TL;DR
Current LLM-based human simulations are unreliable due to inherent model limitations and flawed design, requiring systematic improvements in data, capabilities, and simulation methods to achieve credible, human-aligned outcomes.
Contribution
This paper provides a comprehensive review of LLM-based human simulations, identifies key limitations, and proposes a systematic framework and algorithm to improve their reliability.
Findings
Discrepancies between LLM simulations and real human actions are significant.
Limitations stem from inherent LLM constraints and simulation design flaws.
A structured framework and algorithm are proposed to enhance simulation reliability.
Abstract
Large Language Models (LLMs) are increasingly employed for simulating human behaviors across diverse domains. However, our position is that current LLM-based human simulations remain insufficiently reliable, as evidenced by significant discrepancies between their outcomes and authentic human actions. Our investigation begins with a systematic review of LLM-based human simulations in social, economic, policy, and psychological contexts, identifying their common frameworks, recent advances, and persistent limitations. This review reveals that such discrepancies primarily stem from inherent limitations of LLMs and flaws in simulation design, both of which are examined in detail. Building on these insights, we propose a systematic solution framework that emphasizes enriching data foundations, advancing LLM capabilities, and ensuring robust simulation design to enhance reliability. Finally,…
Taxonomy
Topics: Engineering Technology and Methodologies · Human-Automation Interaction and Safety
