Same-Score Streaks: A Case Study in Probability Modeling
Peter Staab, Rick Cleary

TL;DR
This paper investigates the occurrence and probability of same-score streaks in MLB games from 1901 to 2019, comparing different models to understand streak distributions and simulate their likelihoods.
Contribution
It provides the first detailed analysis of same-score streaks in MLB, developing and comparing probability models to simulate streak occurrences over a century.
Findings
Streaks of length 2, 3, and 4 are quantified in historical data.
Different probability models are evaluated for their fit to streak data.
Simulations provide insights into the likelihood of various streak lengths.
Abstract
A same-score streak in sports is a sequence of consecutive games that all end with the same score. The motivating problem arose in college basketball; however, because the data were difficult to collect, streaks in Major League Baseball (MLB) were studied instead. This paper explores historical data from regular-season games between 1901 and 2019 to estimate the likelihood of streaks of length 2, 3, and 4. We then explore various probability models for the distribution of runs scored during MLB games and seasons and generate simulated statistics for streaks of the same lengths.
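The two steps in the abstract, counting streaks in a sequence of games and estimating their likelihood by simulation, can be sketched in Python as follows. This is a minimal illustration, not the authors' method. It assumes a streak means one team's consecutive games ending with the same (runs for, runs against) pair, and it draws runs from independent Poisson distributions purely as a stand-in for whichever models the paper compares. The names `count_streaks`, `simulate_season`, and `estimate_streak_rate`, the 162-game season, and the mean of 4.5 runs are all hypothetical choices made for this example.

```python
import math
import random
from itertools import groupby


def maximal_run_lengths(scores):
    """Return the lengths of maximal runs of identical consecutive scores."""
    return [sum(1 for _ in group) for _, group in groupby(scores)]


def count_streaks(scores, k):
    """Count maximal runs in which the same score repeats at least k times."""
    return sum(1 for length in maximal_run_lengths(scores) if length >= k)


def poisson_draw(rng, mean):
    """Draw one Poisson variate with Knuth's method (fine for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_season(rng, n_games=162, mean_runs=4.5):
    """Simulate one team's season as (runs for, runs against) pairs.

    Assumption: both teams' runs are independent Poisson draws, and tied
    draws are discarded because the simulated games must have a winner.
    """
    scores = []
    while len(scores) < n_games:
        ours = poisson_draw(rng, mean_runs)
        theirs = poisson_draw(rng, mean_runs)
        if ours != theirs:
            scores.append((ours, theirs))
    return scores


def estimate_streak_rate(k, n_seasons=2000, seed=0):
    """Estimate the fraction of simulated seasons containing a streak of length >= k."""
    rng = random.Random(seed)
    hits = sum(
        count_streaks(simulate_season(rng), k) > 0 for _ in range(n_seasons)
    )
    return hits / n_seasons


if __name__ == "__main__":
    # A hand-made sequence: one run of two 5-3 games, one run of three 2-1 games.
    sample = [(5, 3), (5, 3), (2, 1), (2, 1), (2, 1), (4, 0)]
    for k in (2, 3, 4):
        print(k, count_streaks(sample, k), estimate_streak_rate(k))
```

Counting maximal runs rather than sliding windows means a run of three identical scores contributes one streak of length at least 2, not two overlapping ones. Either convention is defensible, so the choice should match however streaks are tallied in the historical data.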
Taxonomy
Topics: Sports Analytics and Performance · Data Analysis with R · Statistics Education and Methodologies
