Predictive Modeling of Lower-Level English Club Soccer Using Crowd-Sourced Player Valuations
Josh Brown, Yutong Bu, Zachary Cheesman, Benjamin Orman, Iris Horng, Samuel Thomas, Amanda Harsy, Adam Schultze

TL;DR
This study evaluates mathematical models for predicting outcomes across all levels of the English football pyramid using crowd-sourced player valuations, revealing challenges in forecasting lower leagues and questioning the predictive value of crowd-sourced data.
Contribution
It extends predictive modeling to lower-tier English football and other European leagues, incorporating crowd-sourced valuations, and critically examines their predictive effectiveness.
Findings
Lower leagues are harder to predict than top leagues.
After dominant outlier teams are removed, top leagues are as hard to predict as lower leagues.
Crowd-sourced valuations may not reliably predict game outcomes.
Abstract
In this research, we examine how accurately different mathematical models predict game outcomes at various levels of the English football pyramid. Existing work has largely focused on top-level play in European leagues; our work instead analyzes teams throughout the entire English Football League system. We modeled team performance using weighted Colley and Massey ranking methods, which incorporate player valuations from the widely used website Transfermarkt, to predict game outcomes. Our initial analysis found that lower leagues are generally more difficult to forecast. Yet, after removing dominant outlier teams from the analysis, we found that top leagues were just as difficult to predict as lower leagues. We also extended our findings using data from multiple German and Scottish leagues. Finally, we discuss reasons to doubt attributing Transfermarkt's predictive value to wisdom of…
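For readers unfamiliar with the Colley method named in the abstract, the following is a minimal sketch of the standard Colley rating system, not the authors' implementation. The abstract does not say how Transfermarkt valuations enter the weighting, so the optional `weights` argument is a hypothetical placeholder for per-game weights. The function names and the toy fixture list are assumptions made for illustration.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                for c in range(col, n + 1):
                    M[r][c] -= f * M[col][c]
    return [M[i][n] / M[i][i] for i in range(n)]


def colley_ratings(n_teams, games, weights=None):
    """Standard Colley ratings.

    games:   list of (winner, loser) team-index pairs.
    weights: optional per-game weights (hypothetical; how the paper
             derives weights from valuations is not given in the abstract).
    """
    if weights is None:
        weights = [1.0] * len(games)
    # Colley matrix starts as 2*I; right-hand side starts as all ones.
    C = [[2.0 if i == j else 0.0 for j in range(n_teams)] for i in range(n_teams)]
    b = [1.0] * n_teams
    for (w, l), wt in zip(games, weights):
        C[w][w] += wt      # each game adds to both teams' game counts
        C[l][l] += wt
        C[w][l] -= wt      # and links the two opponents
        C[l][w] -= wt
        b[w] += wt / 2.0   # b_i = 1 + (wins_i - losses_i) / 2
        b[l] -= wt / 2.0
    return solve_linear(C, b)


# Toy example: team 0 beats teams 1 and 2, team 1 beats team 2.
games = [(0, 1), (0, 2), (1, 2)]
ratings = colley_ratings(3, games)
```

In this unweighted toy example the ratings come out to 0.7, 0.5, and 0.3, and they average 0.5, which is a general property of Colley ratings. A higher-rated team would be the predicted winner of a fixture between two teams.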
Taxonomy
Topics: Sports Analytics and Performance · Sports, Gender, and Society
