Quantal Response Equilibrium as a Measure of Strategic Sophistication: Theory and Validation for LLM Evaluation

Mateo Pechon-Elkins; Jon Chun

arXiv:2603.10029·cs.GT·March 12, 2026

TL;DR

This paper introduces a game-theoretic evaluation framework, grounded in quantal response equilibrium (QRE), for assessing the strategic reasoning of large language models. It yields rationality estimates calibrated against human data.

Contribution

It develops a QRE-based evaluation method for LLMs: closed-form equilibria for four strategic games, rationality parameters calibrated against human data, finite-sample convergence bounds, and validation across 1,855 games with seven frontier models plus four expansion models.

Findings

01

Bluff rates converge to within 4% of equilibrium.

02

QRE rationality parameter estimates range from 0.05 to 1.10 across games and models.

03

Capability profiles differ across cognitive axes.

Abstract

Theory of Mind benchmarks for large language models typically produce aggregate scores without theoretical grounding, making it unclear whether high performance reflects strategic reasoning or surface-level heuristics. We introduce a game-theoretic evaluation framework grounded in quantal response equilibrium (QRE). We derive closed-form equilibria for four strategic games, each targeting a distinct cognitive capability. We estimate QRE rationality parameters lambda that place model behavior on a continuous scale calibrated against human data (lambda_human in [1.0, 2.5]), and establish finite-sample convergence bounds via martingale concentration. Validation across 1,855 games with seven frontier models (plus four expansion models) confirms predictions: bluff rates converge to within 4% of equilibrium, lambda estimates range from 0.05 to 1.10 across games and models with substantial…
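
To make the role of the rationality parameter concrete, here is a minimal sketch of a logit quantal response and a maximum-likelihood estimate of lambda from observed action counts. It is an illustration under stated assumptions, not the paper's estimator: the logit form, the grid search, the fixed expected payoffs (no equilibrium fixed point is solved), and the names `logit_response`, `log_likelihood`, and `estimate_lambda` are all hypothetical choices made for this example.

```python
import math


def logit_response(payoffs, lam):
    """Logit choice probabilities: P(a) is proportional to exp(lam * u(a)).

    lam = 0 gives uniform random play; larger lam concentrates
    probability on the highest-payoff action.
    """
    scaled = [lam * u for u in payoffs]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def log_likelihood(payoffs, counts, lam):
    """Log-likelihood of observed action counts under the logit model."""
    probs = logit_response(payoffs, lam)
    return sum(c * math.log(p) for c, p in zip(counts, probs))


def estimate_lambda(payoffs, counts, lo=0.0, hi=10.0, steps=10001):
    """Grid-search MLE for lam on [lo, hi] (assumed search range)."""
    best_lam, best_ll = lo, float("-inf")
    for i in range(steps):
        lam = lo + (hi - lo) * i / (steps - 1)
        ll = log_likelihood(payoffs, counts, lam)
        if ll > best_ll:
            best_lam, best_ll = lam, ll
    return best_lam


if __name__ == "__main__":
    # Hypothetical data: two actions with expected payoffs 1.0 and 0.0,
    # chosen 731 and 269 times out of 1000 plays.
    payoffs = [1.0, 0.0]
    counts = [731, 269]
    print("estimated lambda:", estimate_lambda(payoffs, counts))
```

An estimate produced this way can then be read against the human range quoted in the abstract (lambda_human in [1.0, 2.5]). A full QRE fit would additionally require each player's expected payoffs to be consistent with the other players' quantal responses.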

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Speech and dialogue systems