DVM: Towards Controllable LLM Agents in Social Deduction Games
Zheng Zhang, Yihuai Lan, Yangsen Chen, Lei Wang, Xiang Wang, Hao Wang

TL;DR
This paper introduces DVM, a framework that lets large language model agents adapt their skill level in social deduction games such as Werewolf, supporting balanced gameplay and offering insight into the safety and fairness of such agents.
Contribution
DVM is a novel framework that combines reinforcement learning with a win rate-constrained decision mechanism to control LLM agent proficiency in social deduction games.
Findings
DVM outperforms existing methods in Werewolf.
DVM can modulate performance to meet target win rates.
The framework enhances adaptive and balanced gameplay.
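To make the idea of meeting a target win rate concrete, here is a minimal sketch. It is not the paper's algorithm. It assumes some component has already estimated a win probability for each candidate action, and it simply picks the action whose estimate is closest to the target. The function name `pick_action`, the action labels, and the probabilities are all hypothetical and chosen only for illustration.

```python
from typing import Dict


def pick_action(predicted_win_prob: Dict[str, float], target_win_rate: float) -> str:
    """Return the candidate action whose estimated win probability
    is closest to the target win rate (illustrative rule only)."""
    if not predicted_win_prob:
        raise ValueError("no candidate actions")
    return min(
        predicted_win_prob,
        key=lambda action: abs(predicted_win_prob[action] - target_win_rate),
    )


# Hypothetical candidates with made-up win-probability estimates.
candidates = {
    "accuse_player_3": 0.72,
    "stay_silent": 0.48,
    "defend_player_5": 0.31,
}

strong = pick_action(candidates, target_win_rate=0.7)  # agent asked to play well
weak = pick_action(candidates, target_win_rate=0.3)    # agent asked to play weakly
```

Under these assumed estimates, a high target selects the strongest move and a low target selects a deliberately weaker one, which is the sense in which a win-rate constraint could steer proficiency.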
Abstract
Large Language Models (LLMs) have advanced the capability of game agents in social deduction games (SDGs). These games rely heavily on conversation-driven interactions and require agents to draw inferences, make decisions, and express themselves based on that information. While this progress leads to more sophisticated and strategic non-player characters (NPCs) in SDGs, there remains a need to control the proficiency of these agents. Such control not only ensures that NPCs can adapt to varying difficulty levels during gameplay, but also provides insights into the safety and fairness of LLM agents. In this paper, we present DVM, a novel framework for developing controllable LLM agents for SDGs, and demonstrate its implementation on one of the most popular SDGs, Werewolf. DVM comprises three main components: Predictor, Decider, and Discussor. By integrating reinforcement learning with a win rate-constrained…
