Model quality in football: Quantifying the quality of an Expected Threat model
Koen van Arem, Jakob Söhl, Mirjam Bruinsma, Geurt Jongbloed

TL;DR
This paper develops a framework to quantify the quality of Expected Threat models in football, providing guidelines for validation and application based on theoretical analysis and simulations.
Contribution
It introduces a systematic method to assess Expected Threat model error and reliability, offering practical rules of thumb for model validation in football analytics.
Findings
For a fixed number of training points and game states, the model error is approximately log-normally distributed.
Error thresholds are established beyond which player evaluations become unreliable.
Practical guidelines are provided for constructing and validating Expected Threat models.
Abstract
The recent growth in data availability in football has increased the risk of incorrect use of data-driven models, making guidelines on their validation and application necessary. The Expected Threat (xT) model is an accessible option for football organizations that start building in-house methods, yet little is known about how to assess its quality. The aim of this study is twofold: to examine how the model error depends on the number of game states and the number of training points, and to translate these results into guidelines for constructing and applying the model. Using the Markov chain underlying the model, we perform theoretical analyses and simulations to study the model error. These show that the model error is approximately log-normally distributed for a specified number of training points and game states. Additionally, we combine the simulations with expert consultation to…
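The kind of simulation the abstract describes can be illustrated with a minimal sketch. It is not the authors' code. It assumes a toy absorbing Markov chain in which every game state can move to another game state, end in a goal, or end in a loss of possession, so that the xT value of a state is the probability of eventually reaching the goal state. The function names, the Dirichlet-generated "true" chain, the uniform sampling of training points over states, the small smoothing constant, and the mean absolute error are all illustrative assumptions; the paper's own error definition may differ.

```python
import numpy as np


def expected_threat(P):
    """Goal-absorption probability per game state.

    P is an (n, n + 2) row-stochastic matrix: columns 0..n-1 are game
    states, column n is "goal", column n + 1 is "possession lost".
    Solves (I - Q) xT = r, with Q the state-to-state block and r the
    direct probability of a goal.
    """
    n = P.shape[0]
    Q = P[:, :n]
    r = P[:, n]
    return np.linalg.solve(np.eye(n) - Q, r)


def random_chain(n_states, rng):
    """Hypothetical 'true' chain: goals are rare, losses are common."""
    alpha = np.ones(n_states + 2)
    alpha[n_states] = 0.2
    alpha[n_states + 1] = 3.0
    return rng.dirichlet(alpha, size=n_states)


def estimate_chain(P_true, n_points, rng, smoothing=0.01):
    """Estimate the chain from n_points observed transitions.

    Start states are drawn uniformly (an assumption). The small
    smoothing constant keeps every row able to reach an absorbing
    state, so that I - Q stays invertible.
    """
    n = P_true.shape[0]
    starts = rng.integers(0, n, size=n_points)
    counts = np.full(P_true.shape, smoothing)
    for s in range(n):
        k = int(np.count_nonzero(starts == s))
        counts[s] += rng.multinomial(k, P_true[s])
    return counts / counts.sum(axis=1, keepdims=True)


def simulate_errors(n_states, n_points, n_repeats, seed=0):
    """Mean absolute xT error for repeated training sets of equal size."""
    rng = np.random.default_rng(seed)
    P_true = random_chain(n_states, rng)
    xt_true = expected_threat(P_true)
    errors = np.empty(n_repeats)
    for i in range(n_repeats):
        P_hat = estimate_chain(P_true, n_points, rng)
        errors[i] = np.mean(np.abs(expected_threat(P_hat) - xt_true))
    return errors


if __name__ == "__main__":
    for n_points in (200, 2000, 20000):
        log_err = np.log(simulate_errors(12, n_points, 200))
        print(n_points, log_err.mean(), log_err.std())
```

Inspecting a histogram of the logarithm of the simulated errors for a fixed number of game states and training points is one way to check informally whether the errors look roughly log-normal in this toy setting.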
