Beyond Scaling: Assessing Strategic Reasoning and Rapid Decision-Making Capability of LLMs in Zero-sum Environments

Yang Li; Xing Chen; Yutao Liu; Gege Qi; Yanxian BI; Zizhe Wang; Yunjian Zhang; Yao Zhu

arXiv:2603.09337 · cs.CV · March 11, 2026

TL;DR

This paper introduces the STAR Benchmark to evaluate large language models' strategic reasoning and decision-making in adversarial, time-sensitive environments, highlighting the importance of balancing reasoning depth and execution speed.

Contribution

It presents a novel multi-agent evaluation framework and a strategic assessment suite to analyze LLMs' performance in interactive, competitive scenarios with temporal constraints.

Findings

01. Reasoning models excel in turn-based settings but lag in real-time scenarios due to latency.

02. Faster instruction-tuned models outperform reasoning models in real-time environments despite shallower reasoning.

03. The study emphasizes that strategic plans must be translated into timely actions.
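
The trade-off behind these findings can be illustrated with a toy model. The sketch below is not the STAR Benchmark's scoring or API; the `Agent` fields, the scoring functions, and every number are hypothetical, chosen only to show how latency can flip the outcome between a turn-based and a real-time setting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    name: str
    action_value: float  # hypothetical expected payoff per action (reasoning quality)
    latency_s: float     # hypothetical seconds needed to produce one action


def turn_based_score(agent: Agent, turns: int) -> float:
    # Each agent acts exactly once per turn, so latency costs nothing.
    return agent.action_value * turns


def real_time_score(agent: Agent, horizon_s: float) -> float:
    # The clock keeps running: only actions finished within the horizon count.
    actions = int(horizon_s // agent.latency_s)
    return agent.action_value * actions


def winner(a: Agent, b: Agent, score_fn, budget) -> str:
    # Zero-sum comparison: whoever accumulates the higher score wins.
    score_a, score_b = score_fn(a, budget), score_fn(b, budget)
    if score_a == score_b:
        return "draw"
    return a.name if score_a > score_b else b.name


# Made-up agents: a slow, strong reasoner and a fast, weaker responder.
reasoner = Agent("reasoner", action_value=1.0, latency_s=8.0)
fast = Agent("fast", action_value=0.6, latency_s=1.0)

print(winner(reasoner, fast, turn_based_score, 10))   # 10 turns each
print(winner(reasoner, fast, real_time_score, 60.0))  # 60-second match
```

Under these assumed numbers the reasoner wins the turn-based match, while the fast agent wins the real-time one because it completes far more actions within the time horizon.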

Abstract

Large Language Models (LLMs) have achieved strong performance on static reasoning benchmarks, yet their effectiveness as interactive agents operating in adversarial, time-sensitive environments remains poorly understood. Existing evaluations largely treat reasoning as a single-shot capability, overlooking the challenges of opponent-aware decision-making, temporal constraints, and execution under pressure. This paper introduces the Strategic Tactical Agent Reasoning (STAR) Benchmark, a multi-agent evaluation framework that assesses LLMs through 1v1 zero-sum competitive interactions, framing reasoning as an iterative, adaptive decision-making process. STAR supports both turn-based and real-time settings, enabling controlled analysis of long-horizon strategic planning and fast-paced tactical execution within a unified environment. Built on a modular architecture with a standardized API and…

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications · Multi-Agent Systems and Negotiation