In-season prediction of batting averages: A field test of empirical Bayes and Bayes methodologies
Lawrence D. Brown

TL;DR
This study evaluates empirical Bayes and hierarchical Bayes methods for predicting baseball players' future batting averages based on early-season data, validating predictions against actual season outcomes.
Contribution
It introduces a nonparametric empirical Bayes approach for in-season batting average prediction and compares its performance with traditional methods.
Findings
Nonparametric empirical Bayes performs well on full data.
Traditional methods perform better on homogeneous subsets.
Naive predictor performs worst among tested methods.
Abstract
Batting average is one of the principal performance measures for an individual baseball player. It is natural to statistically model this as a binomial-variable proportion, with a given (observed) number of qualifying attempts (called ``at-bats''), an observed number of successes (``hits'') distributed according to the binomial distribution, and with a true (but unknown) value of $p_i$ that represents the player's latent ability. This is a common data structure in many statistical applications, and so the methodological study here has implications for a range of such applications. We look at batting records for each Major League player over the course of a single season (2005). The primary focus is on using only the batting records from an earlier part of the season (e.g., the first 3 months) in order to estimate the batter's latent ability, $p_i$, and consequently, also to predict…
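To make the binomial setup in the abstract concrete, here is a minimal Python sketch of one simple empirical Bayes idea: shrinking each player's early-season average toward the pooled average, with more shrinkage for players who have fewer at-bats. This is a generic method-of-moments illustration, not the estimator studied in the paper. The function name `shrink_toward_mean` and the hits and at-bats figures are hypothetical and chosen only for illustration.

```python
import statistics


def shrink_toward_mean(hits, at_bats):
    """Illustrative method-of-moments shrinkage (assumed, not the paper's method).

    Each raw average H_i / N_i is pulled toward the pooled average; players
    with fewer at-bats (larger sampling variance) are pulled harder.
    """
    raw = [h / n for h, n in zip(hits, at_bats)]
    pooled = sum(hits) / sum(at_bats)

    # Approximate binomial sampling variance of each raw average.
    samp_var = [pooled * (1 - pooled) / n for n in at_bats]

    # Between-player variance of latent ability: observed spread minus
    # average sampling noise, floored at zero.
    tau2 = max(0.0, statistics.variance(raw) - statistics.mean(samp_var))

    estimates = []
    for r, v in zip(raw, samp_var):
        # Weight on the player's own average; 1.0 if there is no noise at all.
        weight = tau2 / (tau2 + v) if (tau2 + v) > 0 else 1.0
        estimates.append(pooled + weight * (r - pooled))
    return estimates


# Hypothetical early-season records (not data from the paper).
hits = [30, 45, 20, 60, 12]
at_bats = [100, 150, 90, 180, 60]
for h, n, est in zip(hits, at_bats, shrink_toward_mean(hits, at_bats)):
    print(f"raw={h / n:.3f}  shrunk={est:.3f}  at_bats={n}")
```

The naive predictor mentioned in the findings corresponds to using the raw average `h / n` unchanged. In this sketch each shrunken estimate always lies between the raw average and the pooled average.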
