Leveraging Minute-by-Minute Soccer Match Event Data to Adjust Team's Offensive Production for Game Context
Andrey Skripnikov, Ahmet Cemek, David Gillman

TL;DR
This paper develops a statistical model to adjust soccer offensive statistics based on game context, providing more accurate assessments of team performance by accounting for factors like score difference, game minute, and red cards.
Contribution
It introduces a count-response Generalized Additive Model that incorporates multiple contextual features to normalize offensive production across different game situations.
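As a rough illustration of how a log-link count model can be used to normalize production across game situations, the sketch below rescales each minute's shot count by the ratio of the expected rate in a neutral context (tied score, equal numbers of players) to the expected rate in the context actually observed. This is a simplified sketch under stated assumptions, not the paper's implementation: the coefficient values, the `expected_rate` and `adjusted_total` helpers, and the purely linear terms (standing in for the smooth terms of a Generalized Additive Model) are all hypothetical.

```python
import math

# Hypothetical log-link coefficients, invented for illustration only.
# They are NOT estimates from the paper.
COEF = {
    "intercept": math.log(0.12),  # assumed baseline shots per minute
    "score_diff": -0.10,          # assumed: a team that leads shoots less
    "red_card_diff": -0.30,       # assumed: more red cards than opponent, fewer shots
    "home": 0.15,                 # assumed home advantage
}


def expected_rate(score_diff, red_card_diff, home):
    """Expected shots per minute under a log-link count model."""
    eta = (
        COEF["intercept"]
        + COEF["score_diff"] * score_diff
        + COEF["red_card_diff"] * red_card_diff
        + COEF["home"] * home
    )
    return math.exp(eta)


def adjusted_total(minutes):
    """Sum per-minute shots after rescaling each minute to a neutral context.

    The neutral context keeps the team's home/away status but sets the
    score and red card differentials to zero.
    """
    total = 0.0
    for m in minutes:
        actual = expected_rate(m["score_diff"], m["red_card_diff"], m["home"])
        neutral = expected_rate(0, 0, m["home"])
        total += m["shots"] * neutral / actual
    return total


# Two illustrative minutes for a home team: one while tied, one while trailing by a goal.
minutes = [
    {"shots": 1, "score_diff": 0, "red_card_diff": 0, "home": 1},
    {"shots": 2, "score_diff": -1, "red_card_diff": 0, "home": 1},
]
raw_total = sum(m["shots"] for m in minutes)
print(raw_total, round(adjusted_total(minutes), 3))
```

With these assumed coefficients, shots taken while trailing are discounted, so the adjusted total falls below the raw total, while shots taken in a tied game are left unchanged.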
Findings
Adjusted offensive stats better reflect true team performance
Model reduces bias caused by game situation variability
Provides a standardized comparison metric for teams across matches
Abstract
In soccer, game context can skew offensive statistics in ways that misrepresent how well a team has played. For instance, in England's 1-2 loss to France in the 2022 FIFA World Cup quarterfinal, England attempted considerably more shots (16 to France's 8) and won more corners (5 to 2), potentially suggesting they played better despite the loss. However, these statistics were largely accumulated when France was ahead and more willing to concede offensive initiative to England. To explore how game context influences offensive performance, we analyze minute-by-minute event-sequenced match data from 15 seasons across five major European leagues. Using count-response Generalized Additive Modeling, we consider features such as score and red card differential, home/away status, pre-match win probabilities, and game minute. Moreover, we leverage interaction terms to test several…
