Do LLMs Know When to Flip a Coin? Strategic Randomization through Reasoning and Experience
Lingyu Yang (1) ((1) Shanghai Jiao Tong University)

TL;DR
This paper investigates whether large language models can decide when to randomize in strategic games, using a novel zero-sum game inspired by the Tian Ji Horse Race to expose their reasoning capabilities and limitations.
Contribution
It introduces a new zero-sum game for evaluating LLMs' strategic randomization and analyzes the behavior of five models across three prompt styles (framed, neutral, and hinted).
Findings
Weaker models remain deterministic regardless of prompts.
Stronger models increase randomization with explicit hints.
Models adapt their strategies based on opponent strength and game context.
Abstract
Strategic randomization is a key principle in game theory, yet it remains underexplored in large language models (LLMs). Prior work often conflates the cognitive decision to randomize with the mechanical generation of randomness, leading to incomplete evaluations. To address this, we propose a novel zero-sum game inspired by the Tian Ji Horse Race, where the Nash equilibrium corresponds to a maximal entropy strategy. The game's complexity masks this property from untrained humans and underdeveloped LLMs. We evaluate five LLMs across prompt styles -- framed, neutral, and hinted -- using competitive multi-tournament gameplay with system-provided random choices, isolating the decision to randomize. Results show that weaker models remain deterministic regardless of prompts, while stronger models exhibit increased randomization under explicit hints. When facing weaker models, strong LLMs…
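The link between equilibrium play and maximal entropy can be illustrated with the classic Tian Ji horse race, which is not the paper's game but the story that inspired it. The sketch below assumes hypothetical horse strengths in which each of the king's horses narrowly beats Tian Ji's horse of the same tier and loses to none below it; the names KING, TIAN, resolve, and the "RANDOM" token are illustrative assumptions, not taken from the paper. Under these assumptions every king ordering is beaten by exactly one Tian Ji ordering, so the uniform mix over all six orderings is an equilibrium for both sides. The resolve helper mirrors the idea of system-provided random choices: the player only decides whether to randomize, and the system does the sampling.

```python
import math
import random
from itertools import permutations

# Hypothetical strengths (assumption): king's horse wins within a tier,
# but Tian Ji's horse beats any king's horse from a lower tier.
KING = {"top": 6, "mid": 4, "low": 2}
TIAN = {"top": 5, "mid": 3, "low": 1}

# A pure strategy is an ordering of the three horses across three races.
ORDERS = list(permutations(["top", "mid", "low"]))


def tian_wins(king_order, tian_order):
    """Tian Ji wins the match if he wins at least two of the three races."""
    races_won = sum(TIAN[t] > KING[k] for k, t in zip(king_order, tian_order))
    return races_won >= 2


def payoff_matrix():
    """Rows: Tian Ji's orderings. Columns: king's orderings. Entry 1 = Tian Ji wins."""
    return [[1 if tian_wins(k, t) else 0 for k in ORDERS] for t in ORDERS]


def entropy_bits(probs):
    """Shannon entropy of a mixed strategy, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def tian_guarantee(tian_mix, matrix):
    """Tian Ji's win probability against the king's best response."""
    n = len(matrix)
    return min(sum(tian_mix[i] * matrix[i][j] for i in range(n)) for j in range(n))


def king_exposure(king_mix, matrix):
    """Tian Ji's win probability when he best-responds to the king's mix."""
    n = len(matrix)
    return max(sum(king_mix[j] * matrix[i][j] for j in range(n)) for i in range(n))


def resolve(choice, rng):
    """Hypothetical protocol: the player returns an ordering or the token
    'RANDOM'; in the latter case the system samples uniformly on its behalf."""
    return rng.choice(ORDERS) if choice == "RANDOM" else tuple(choice)


if __name__ == "__main__":
    m = payoff_matrix()
    uniform = [1 / len(ORDERS)] * len(ORDERS)
    fixed = [1.0] + [0.0] * (len(ORDERS) - 1)
    print("uniform entropy (bits):", entropy_bits(uniform))
    print("Tian Ji guarantee, uniform vs fixed:",
          tian_guarantee(uniform, m), tian_guarantee(fixed, m))
    print("King exposure, uniform vs fixed:",
          king_exposure(uniform, m), king_exposure(fixed, m))
    print("system-resolved move:", resolve("RANDOM", random.Random(0)))
```

In this toy version, a deterministic king can always be beaten and a deterministic Tian Ji can always be shut out, while the uniform mix holds both sides to the same win probability. That is the sense in which the decision to randomize, separate from the ability to generate random numbers, carries the strategic content.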
Taxonomy
Topics: Digital Platforms and Economics · FinTech, Crowdfunding, Digital Finance · ERP Systems Implementation and Impact
