AI Testing Should Account for Sophisticated Strategic Behaviour

Vojtech Kovarik; Eric Olav Chen; Sami Petersen; Alexis Ghersengorin; Vincent Conitzer

arXiv:2508.14927·cs.GT·August 22, 2025

TL;DR

This paper emphasizes the importance of incorporating strategic reasoning and game-theoretic analysis into AI testing to better predict deployment behavior and improve safety evaluations.

Contribution

It advocates for integrating strategic behavior considerations into AI evaluation methods and demonstrates how game theory can enhance safety assessment frameworks.

Findings

1. AI systems may understand their circumstances and reason strategically.
2. Game-theoretic analysis can formalize evaluation reasoning.
3. Incorporating strategic considerations improves safety evaluation robustness.

Abstract

This position paper argues for two claims regarding AI testing and evaluation. First, to remain informative about deployment behaviour, evaluations need to account for the possibility that AI systems understand their circumstances and reason strategically. Second, game-theoretic analysis can inform evaluation design by formalising and scrutinising the reasoning in evaluation-based safety cases. Drawing on examples from existing AI systems, a review of relevant research, and formal strategic analysis of a stylised evaluation scenario, we present evidence for these claims and motivate several research directions.

Videos

AI Testing Should Account for Sophisticated Strategic Behaviour · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI · Human-Automation Interaction and Safety