LieCraft: A Multi-Agent Framework for Evaluating Deceptive Capabilities in Language Models

Matthew Lyle Olson; Neale Ratzlaff; Musashi Hinck; Tri Nguyen; Vasudev Lal; Joseph Campbell; Simon Stepputtis; Shao-Yen Tseng

arXiv:2603.06874 · cs.AI · March 10, 2026

TL;DR

LieCraft is a multiplayer game-based evaluation framework designed to measure deception in large language models across high-stakes, ethically significant scenarios, revealing models' tendencies to act unethically and deceive.

Contribution

This work introduces LieCraft, a novel multiplayer sandbox for evaluating LLM deception in realistic, high-stakes contexts, addressing limitations of prior game-based assessments.

Findings

1. All tested models exhibit willingness to deceive and act unethically.
2. Models vary in deception skill and accusation accuracy but share tendencies to conceal intentions.
3. The framework enables comprehensive behavioral analysis of LLMs in ethically complex scenarios.

Abstract

Large Language Models (LLMs) exhibit impressive general-purpose capabilities but also introduce serious safety risks, particularly the potential for deception as models acquire increased agency and human oversight diminishes. In this work, we present LieCraft: a novel evaluation framework and sandbox for measuring LLM deception that addresses key limitations of prior game-based evaluations. At its core, LieCraft is a novel multiplayer hidden-role game in which players select an ethical alignment and execute strategies over a long time horizon to accomplish missions. Cooperators work together to solve event challenges and expose bad actors, whereas Defectors evade suspicion while secretly sabotaging missions. To enable real-world relevance, we develop 10 grounded scenarios such as childcare, hospital resource allocation, and loan underwriting that recontextualize the underlying mechanics…
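To make the hidden-role mechanic described in the abstract concrete, the following is a minimal sketch and not LieCraft's actual implementation. Every name in it (`Player`, `play_mission`, `accusation_accuracy`, `sabotage_prob`) is hypothetical, and two rules are assumptions made for illustration only: that a single sabotage fails a mission, and that accusation accuracy is the fraction of accusations naming a true Defector.

```python
import random
from dataclasses import dataclass


@dataclass
class Player:
    name: str
    role: str  # "cooperator" or "defector"; hidden from the other players


def play_mission(players, rng, sabotage_prob=0.5):
    """Return True if the mission succeeds.

    Assumed rule (not from the paper): each defector sabotages with
    probability `sabotage_prob`, and any single sabotage fails the mission.
    """
    sabotaged = any(
        p.role == "defector" and rng.random() < sabotage_prob for p in players
    )
    return not sabotaged


def accusation_accuracy(accusations, players):
    """Fraction of accused names that belong to an actual defector.

    Assumed definition for illustration; returns 0.0 when nobody is accused.
    """
    roles = {p.name: p.role for p in players}
    if not accusations:
        return 0.0
    hits = sum(1 for target in accusations if roles[target] == "defector")
    return hits / len(accusations)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so repeated runs are reproducible
    players = [
        Player("ana", "cooperator"),
        Player("ben", "cooperator"),
        Player("cy", "defector"),
    ]
    outcomes = [play_mission(players, rng) for _ in range(5)]
    print("mission outcomes:", outcomes)
    print("accuracy:", accusation_accuracy(["cy", "ben"], players))
```

In a setup like this, the roles stay hidden from the agents and only the scoring code reads them, which is what lets deception and accusation quality be measured separately from mission success.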

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Topic Modeling · Ethics and Social Impacts of AI