Building an NCAA men's basketball predictive model and quantifying its success
Michael J. Lopez, Gregory Matthews

TL;DR
This paper develops a predictive model for NCAA men's basketball tournament outcomes, evaluates its success, and quantifies the role of luck versus skill in winning predictions using simulation and contest data.
Contribution
It introduces a prediction model built under the binomial log-likelihood loss function and uses simulated tournament outcomes and imputed pool standings to estimate how much of an entry's success can be attributed to luck.
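For readers unfamiliar with the loss function named above, the following is a minimal sketch of the binomial log-likelihood loss (often called log loss) for a set of game predictions. The function name `log_loss` and the clipping constant `eps` are illustrative choices, not taken from the paper.

```python
import math

def log_loss(probs, outcomes, eps=1e-15):
    """Mean binomial log-likelihood loss; lower is better.

    probs    -- predicted probability that the first-listed team wins each game
    outcomes -- 1 if that team won, 0 otherwise
    eps      -- clipping constant (an assumption here) that keeps log() finite
    """
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0) on extreme predictions
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(probs)
```

Under this loss a confident wrong prediction costs far more than a cautious wrong one: predicting 0.9 for a team that loses is penalized more heavily than predicting 0.6.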
Findings
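The authors' winning Kaggle entry had no more than about a 12% chance of finishing first, according to their simulations.
Luck plays a large role in contest results: even the first-place entry was far more likely than not to finish somewhere else.
Simulating tournament outcomes under assumed true game probabilities gives a way to separate skill from luck when judging prediction contests.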
Abstract
The old adage says that it is better to be lucky than to be good, but when it comes to winning NCAA tournament pools, do you need to be both? This paper attempts to answer this question using data from the 2014 men's basketball tournament and more than 400 predictions of game outcomes submitted to a contest hosted by the website Kaggle. We begin by describing how we built a prediction model for men's basketball tournament outcomes under the binomial log-likelihood loss function. Next, under different sets of true underlying game probabilities, we simulate tournament outcomes and imputed pool standings, in an effort to determine how much of an entry's success can be attributed to luck. While one of our two submissions finished first in the Kaggle contest, we estimate that this winning entry had no more than about a 12% chance of doing so, even under the most optimistic of game…
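The simulation idea described in the abstract can be sketched roughly as follows: fix a set of true game probabilities, repeatedly draw game outcomes from them, rescore every entry, and count how often a given entry finishes first. This is a simplified illustration, not the authors' procedure. It assumes entries are ranked by log loss, treats games as independent draws on a fixed list of matchups, and all names (`estimate_win_probability`, `true_probs`, `entries`) and the demo data are hypothetical.

```python
import math
import random

def log_loss(probs, outcomes, eps=1e-15):
    """Mean binomial log-likelihood loss; lower is better."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(probs)

def estimate_win_probability(true_probs, entries, target, n_sims=1000, seed=0):
    """Estimate how often entries[target] has the lowest loss.

    true_probs -- assumed true win probability for each game
    entries    -- list of prediction vectors, one per contest entry
    target     -- index of the entry of interest
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        # Draw one set of game outcomes from the assumed true probabilities.
        outcomes = [1 if rng.random() < p else 0 for p in true_probs]
        losses = [log_loss(e, outcomes) for e in entries]
        # The entry with the lowest loss wins; exact ties go to the lowest index.
        best = min(range(len(entries)), key=losses.__getitem__)
        wins += (best == target)
    return wins / n_sims

if __name__ == "__main__":
    # Hypothetical demo data: 63 games and 20 entries, where entry 0
    # happens to equal the true probabilities and the rest are noisy copies.
    data_rng = random.Random(1)
    true_probs = [data_rng.uniform(0.5, 0.95) for _ in range(63)]
    entries = [list(true_probs)]
    for _ in range(19):
        noisy = [min(max(p + data_rng.gauss(0, 0.1), 0.05), 0.95) for p in true_probs]
        entries.append(noisy)
    print(estimate_win_probability(true_probs, entries, target=0))
```

Even when one entry's predictions match the assumed true probabilities exactly, random game outcomes let other entries finish ahead of it in some simulated tournaments, which is the sense in which luck enters the standings.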
Taxonomy
Topics: Sports Analytics and Performance · Statistics Education and Methodologies · Data Analysis with R
