Exploring chance in NCAA basketball
Albrecht Zimmermann

TL;DR
This paper investigates the role of chance in NCAA basketball outcomes by proposing a clustering-based method to better estimate the limits of predictability and assess model performance.
Contribution
It introduces a clustering approach using team profiles and scheduling data to derive realistic bounds on predictive accuracy, improving upon previous simplified models.
Findings
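The simulated win-percentage distributions are much closer to the observed distributions than those produced by the earlier approach.
The method gives higher estimates of the proportion of matches decided by chance and tighter limits on predictive accuracy.
The season-specific limits can be used to assess how well predictive models perform on those seasons.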
Abstract
There seems to be an upper limit to predicting the outcome of matches in (semi-)professional sports. Recent work has proposed that this is due to chance and attempts have been made to simulate the distribution of win percentages to identify the most likely proportion of matches decided by chance. We argue that the approach that has been chosen so far makes some simplifying assumptions that cause its result to be of limited practical value. Instead, we propose to use clustering of statistical team profiles and observed scheduling information to derive limits on the predictive accuracy for particular seasons, which can be used to assess the performance of predictive models on those seasons. We show that the resulting simulated distributions are much closer to the observed distributions and give higher assessments of chance and tighter limits on predictive accuracy.
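The simulation idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the clustering method proposed in the paper. It assumes a round-robin schedule, a strict strength ordering of teams, and that a fixed fraction of matches is decided by a fair coin flip; these assumptions, and the names `accuracy_limit` and `simulate_win_percentages`, are hypothetical and chosen only for illustration. Under these assumptions, a predictor that always picks the stronger team is right on every skill-decided match and on half of the chance-decided ones, which gives the accuracy limit computed below.

```python
import random
from statistics import pstdev


def accuracy_limit(chance_fraction):
    """Best achievable accuracy if `chance_fraction` of matches are fair coin flips.

    Assumption: the remaining matches are always won by the stronger team.
    """
    return (1.0 - chance_fraction) + 0.5 * chance_fraction


def simulate_win_percentages(n_teams, chance_fraction, rng):
    """Simulate one round-robin season and return each team's win percentage.

    Assumption: team 0 is the strongest and team n_teams - 1 the weakest.
    """
    wins = [0] * n_teams
    for i in range(n_teams):
        for j in range(i + 1, n_teams):
            if rng.random() < chance_fraction:
                winner = rng.choice((i, j))  # match decided by chance
            else:
                winner = i  # match decided by skill: lower index wins
            wins[winner] += 1
    games_per_team = n_teams - 1
    return [w / games_per_team for w in wins]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for chance in (0.0, 0.5, 1.0):
        pcts = simulate_win_percentages(30, chance, rng)
        # Spread of win percentages shrinks as more matches are coin flips.
        print(chance, round(pstdev(pcts), 3), accuracy_limit(chance))
```

Comparing the spread of simulated win percentages with the spread observed in a real season, for different values of `chance_fraction`, is one way to pick the most plausible proportion of chance-decided matches. The abstract argues that simplified setups of this kind are of limited practical value and replaces them with clustered team profiles and the observed schedule.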
Taxonomy
Topics: Sports Analytics and Performance · Data Mining Algorithms and Applications · Statistics Education and Methodologies
