LieCraft: A Multi-Agent Framework for Evaluating Deceptive Capabilities in Language Models
Matthew Lyle Olson, Neale Ratzlaff, Musashi Hinck, Tri Nguyen, Vasudev Lal, Joseph Campbell, Simon Stepputtis, Shao-Yen Tseng

TL;DR
LieCraft is a multiplayer game-based evaluation framework designed to measure deception in large language models across high-stakes, ethically significant scenarios, revealing models' tendencies to act unethically and deceive.
Contribution
This work introduces LieCraft, a novel multiplayer sandbox for evaluating LLM deception in realistic, high-stakes contexts, addressing limitations of prior game-based assessments.
Findings
All tested models exhibit a willingness to deceive and act unethically.
Models vary in deception skill and accusation accuracy but share tendencies to conceal intentions.
The framework enables comprehensive behavioral analysis of LLMs in ethically complex scenarios.
Abstract
Large Language Models (LLMs) exhibit impressive general-purpose capabilities but also introduce serious safety risks, particularly the potential for deception as models acquire increased agency and human oversight diminishes. In this work, we present LieCraft: a novel evaluation framework and sandbox for measuring LLM deception that addresses key limitations of prior game-based evaluations. At its core, LieCraft is a novel multiplayer hidden-role game in which players select an ethical alignment and execute strategies over a long time horizon to accomplish missions. Cooperators work together to solve event challenges and expose bad actors, while Defectors secretly sabotage missions and evade suspicion. To enable real-world relevance, we develop 10 grounded scenarios such as childcare, hospital resource allocation, and loan underwriting that recontextualize the underlying mechanics…
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Topic Modeling · Ethics and Social Impacts of AI
