Applications of Improvements to the Pythagorean Won-Loss Expectation in Optimizing Rosters
Alexander F. Almeida, Kevin Dayaratna, Steven J. Miller, and Andrew K. Yang

TL;DR
This paper improves the Pythagorean won-loss expectation model by allowing the Weibull distributions for runs scored and runs allowed to have different parameters. The win probability must then be evaluated numerically, and the resulting MLB season predictions are more accurate.
Contribution
It extends Miller's theoretical model to account for different shape parameters, enhancing fit and predictive accuracy over previous models.
Findings
Better fit to MLB data from 1994-2023.
Higher accuracy in season win predictions.
Requires numerical integration instead of a closed-form expression.
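To illustrate why numerical integration is needed, here is a minimal Python sketch, not the authors' implementation. It assumes runs scored X and runs allowed Y are independent two-parameter Weibull variables with no location shift; the function and parameter names (`win_probability`, `alpha_rs`, `gamma_rs`, and so on) are hypothetical. It estimates P(X > Y) with a plain midpoint rule after a change of variables, and checks the result against the closed form that holds when both shape parameters are equal.

```python
import math


def win_probability(alpha_rs, gamma_rs, alpha_ra, gamma_ra, n=20000):
    """Estimate P(X > Y) for independent Weibull X (scale alpha_rs, shape gamma_rs)
    and Y (scale alpha_ra, shape gamma_ra). Assumption: no location shift.

    P(X > Y) = integral over y of f_Y(y) * exp(-(y / alpha_rs) ** gamma_rs).
    Substituting u = F_Y(y) maps the integral onto (0, 1), where a midpoint
    rule is applied.
    """
    ratio = alpha_ra / alpha_rs
    total = 0.0
    for i in range(n):
        u = (i + 0.5) / n                  # midpoint of the i-th cell
        t = -math.log1p(-u)                # t = (y / alpha_ra) ** gamma_ra
        y_scaled = ratio * t ** (1.0 / gamma_ra)   # y / alpha_rs
        total += math.exp(-(y_scaled ** gamma_rs))  # survival function of X at y
    return total / n


def closed_form(alpha_rs, alpha_ra, gamma):
    """Closed form for P(X > Y) when both shapes equal gamma."""
    return alpha_rs ** gamma / (alpha_rs ** gamma + alpha_ra ** gamma)


if __name__ == "__main__":
    # Equal shapes: the numerical estimate should match the closed form.
    print(win_probability(5.0, 1.8, 4.0, 1.8), closed_form(5.0, 4.0, 1.8))
    # Different shapes: no closed form is used, only the numerical estimate.
    print(win_probability(5.0, 1.7, 4.0, 1.9))
```

With equal shapes the integral collapses to the Pythagorean-style ratio of scale parameters; once the shapes differ, that simplification no longer applies and the integral is evaluated numerically, as in the first function.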
Abstract
Bill James' Pythagorean formula has for decades done an excellent job estimating a baseball team's winning percentage from very little data: if the average runs scored and allowed are denoted respectively by $\mathrm{RS}$ and $\mathrm{RA}$, there is some $\gamma$ such that the winning percentage is approximately $\mathrm{RS}^\gamma / (\mathrm{RS}^\gamma + \mathrm{RA}^\gamma)$. One use case is to determine the value of potential signings to the team, as it allows us to estimate how many more wins one obtains over a season given an estimated change in run production and concession. We summarize earlier work on the subject, and extend the earlier theoretical model of Miller (who assumed the home and away teams' runs arise from independent Weibull distributions with the same shape parameter $\gamma$; this has been observed to describe the observed run data well and yields a win probability…
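The signing use case described in the abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's method: the helper names are hypothetical, the default exponent of 1.83 is a commonly quoted value rather than one taken from this paper, and a 162-game season is assumed.

```python
def pythagorean_win_pct(rs, ra, gamma=1.83):
    """Pythagorean estimate RS^gamma / (RS^gamma + RA^gamma).
    gamma=1.83 is an assumed default, not a value from the paper."""
    return rs ** gamma / (rs ** gamma + ra ** gamma)


def extra_wins(rs, ra, delta_rs, delta_ra, gamma=1.83, games=162):
    """Estimated change in season wins if runs scored change by delta_rs
    and runs allowed change by delta_ra (negative delta_ra = fewer allowed)."""
    before = pythagorean_win_pct(rs, ra, gamma)
    after = pythagorean_win_pct(rs + delta_rs, ra + delta_ra, gamma)
    return games * (after - before)


if __name__ == "__main__":
    # Hypothetical team: 700 runs scored and allowed; a signing is projected
    # to add 30 runs scored and prevent 20 runs allowed.
    print(extra_wins(700, 700, delta_rs=30, delta_ra=-20))
```

The estimate is only as good as the projected run changes fed into it; the formula itself just converts run totals into an expected winning percentage.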
Taxonomy
Topics: Sports Analytics and Performance · Sports Dynamics and Biomechanics · Statistics Education and Methodologies
