Strat-Reasoner: Reinforcing Strategic Reasoning of LLMs in Multi-Agent Games

Yidong He; Yutao Lai; Pengxu Yang; Jiarui Gan; Jiexin Wang; Yi Cai; Mengchen Zhao

arXiv:2605.04906 · cs.AI · May 7, 2026

TL;DR

Strat-Reasoner improves large language models' strategic reasoning in multi-agent games by combining recursive reasoning, a centralized evaluation module, and group-relative reinforcement learning, yielding a 22.1% average performance improvement across the games evaluated.
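The summary above names group-relative reinforcement learning but this page does not give the paper's formulation. As a minimal sketch of the general idea only, assuming a common variant in which each sampled response's reward is normalized against the mean and standard deviation of its own sampling group, the advantage computation might look like the following. The function name, the `eps` constant, and the example rewards are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one sampling group (illustrative assumption).

    Each reward is compared with the group's mean and scaled by the group's
    population standard deviation; eps guards against division by zero when
    every response in the group received the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four responses sampled for the same game state.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Responses that score above their group's average receive a positive advantage and those below receive a negative one, so no separately learned value baseline is needed in this sketch.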

Contribution

This work introduces a novel RL framework with recursive reasoning and centralized evaluation to improve LLMs' strategic reasoning in multi-agent settings.

Findings

1. Achieves a 22.1% average performance improvement across multi-agent games.

2. Introduces a recursive reasoning paradigm integrating multiple agents' reasoning.

3. Employs a centralized Chain-of-Thought comparison module for reasoning quality evaluation.

Abstract

While Large Language Models (LLMs) excel at certain reasoning tasks, they struggle in multi-agent games, where the final outcome depends on the joint strategies of all agents. In such games, the non-stationarity of other agents makes it difficult both to evaluate the reasoning process and to assign credit across multiple reasoning steps. Existing single-agent reinforcement learning (RL) approaches and their multi-agent extensions fail to address these challenges because they do not incorporate other agents into the reasoning process. In this work, we propose Strat-Reasoner, a novel RL-based framework that improves LLMs' strategic reasoning ability in multi-agent games. We introduce a recursive reasoning paradigm in which an agent's reasoning also integrates other agents' reasoning processes. To provide effective reward signals for the intermediate reasoning…
