A Bayesian Variable Selection Approach to Major League Baseball Hitting Metrics
Blakeley B. McShane, Alexander Braunstein, James Piette, and Shane T. Jensen

TL;DR
This paper introduces a Bayesian hierarchical model to identify the most predictive offensive metrics in Major League Baseball, providing a reliable and comprehensive analysis of player performance indicators.
Contribution
It presents a novel Bayesian variable selection approach that automatically adjusts for multiple testing and estimates posterior distributions, improving the reliability of metric selection.
Findings
33 of the 50 metrics examined demonstrate signal
Metrics are highly correlated and linked to traditional performance aspects
The method outperforms alternative variable selection techniques
Abstract
Numerous statistics have been proposed for the measure of offensive ability in major league baseball. While some of these measures may offer moderate predictive power in certain situations, it is unclear which simple offensive metrics are the most reliable or consistent. We address this issue with a Bayesian hierarchical model for variable selection to capture which offensive metrics are most predictive within players across time. Our sophisticated methodology allows for full estimation of the posterior distributions for our parameters and automatically adjusts for multiple testing, providing a distinct advantage over alternative approaches. We implement our model on a set of 50 different offensive metrics and discuss our results in the context of comparison to other variable selection techniques. We find that 33/50 metrics demonstrate signal. However, these metrics are highly…
