Learning Strategic Language Agents in the Werewolf Game with Iterative Latent Space Policy Optimization

Zelai Xu; Wanjun Gu; Chao Yu; Yi Wu; Yu Wang

arXiv:2502.04686·cs.AI·June 19, 2025

TL;DR

This paper introduces LSPO, an iterative framework combining game theory and LLM fine-tuning to develop strategic language agents for the Werewolf game, significantly improving their reasoning and communication capabilities.

Contribution

The paper presents Latent Space Policy Optimization (LSPO), a method that makes the unbounded free-form language action space tractable by working in the much more compact underlying strategy space, which improves strategic decision-making in language-based games.

Findings

01. LSPO agents outperform existing Werewolf agents.

02. Iterative expansion of the strategy space improves performance.

03. LSPO combines game-theoretic optimization with LLM fine-tuning.

Abstract

Large language model (LLM) agents have recently demonstrated impressive capabilities in various domains like open-ended conversation and multi-step decision-making. However, it remains challenging for these agents to solve strategic language games, such as Werewolf, which demand both strategic decision-making and free-form language interactions. Existing LLM agents often suffer from intrinsic bias in their action distributions and limited exploration of the unbounded text action space, resulting in suboptimal performance. To address these challenges, we propose Latent Space Policy Optimization (LSPO), an iterative framework that combines game-theoretic methods with LLM fine-tuning to build strategic language agents. LSPO leverages the observation that while the language space is combinatorially large, the underlying strategy space is relatively compact. We first map free-form utterances…
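
The observation that many distinct utterances can share one underlying strategy can be illustrated with a small sketch. The abstract is cut off before it describes how LSPO performs the mapping, so this is not the authors' procedure: it is a toy illustration that assumes a bag-of-words embedding and plain k-means, and the utterances, the function names (`embed`, `kmeans`, `assign_strategies`), and the choice of k = 2 are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy data: six different surface forms, arguably two intents
# ("claim seer and accuse player 3" vs. "claim villager, no information").
UTTERANCES = [
    "i am the seer and player 3 is a werewolf",
    "as the seer i checked player 3 and he is a werewolf",
    "seer here player 3 is a werewolf",
    "i am only a villager with no information",
    "just a villager here with no information",
    "only a villager and i have no information",
]


def embed(text, vocab):
    """Unit-length bag-of-words vector (an assumed stand-in for a real embedding)."""
    counts = Counter(text.lower().split())
    vec = [float(counts[word]) for word in vocab]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point farthest from every centroid chosen so far.
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(farthest)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def assign_strategies(utterances, k):
    """Map each free-form utterance to one of k discrete latent strategy ids."""
    vocab = sorted({word for u in utterances for word in u.lower().split()})
    points = [embed(u, vocab) for u in utterances]
    labels, _ = kmeans(points, k)
    return labels


labels = assign_strategies(UTTERANCES, k=2)
for label, utterance in zip(labels, UTTERANCES):
    print(label, utterance)
```

In this toy setting, a decision procedure would only need to choose between two strategy ids rather than among all possible sentences, which is the kind of reduction the abstract alludes to.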

Videos

Learning Strategic Language Agents in the Werewolf Game with Iterative Latent Space Policy Optimization · SlidesLive

Taxonomy

Topics: Speech and dialogue systems · Natural Language Processing Techniques · Language, Linguistics, Cultural Analysis