Random forest model identifies serve strength as a key predictor of tennis match outcome
Zijian Gao, Amanda Kowalczyk

TL;DR
This study applies machine learning to a large tennis dataset to predict match outcomes with over 80% accuracy, outperforming predictions based on betting odds alone and highlighting serve strength as a key predictor.
Contribution
It demonstrates that simple machine learning models can accurately predict tennis match results, outperform predictions based on betting odds alone, and identify serve strength as a crucial factor.
Findings
Achieved over 80% prediction accuracy.
Serve strength identified as a key predictor.
Predictions are more accurate than those based on betting odds alone.
Abstract
Tennis is a popular sport worldwide, boasting millions of fans and numerous national and international tournaments. Like many sports, tennis has benefitted from rigorous record-keeping of game and player information, as well as from the growth of machine learning methods in sports analytics. Of particular interest to bettors and betting companies alike is the potential use of sports records to predict tennis match outcomes before a match starts. We compiled, cleaned, and used the largest database of tennis match information to date to predict match outcomes using fairly simple machine learning methods. Such methods offer fast fitting and prediction times, making it easy to incorporate new data and make real-time predictions. We were able to predict match outcomes with upwards of 80% accuracy, much greater than predictions using betting odds alone, and identify serve…
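To make the "betting odds alone" comparison concrete, here is a minimal sketch of one common way such a baseline can be computed: pick the player the odds favour and measure how often that pick wins. This is an assumption for illustration, not the authors' procedure. The summary does not say how the odds-based predictions were derived, the decimal-odds format is assumed, and the matches and function names below are made up.

```python
# Toy illustration only: these matches are invented, not taken from the paper's dataset.

def implied_probabilities(odds_a, odds_b):
    """Convert two decimal odds into win probabilities that sum to 1."""
    raw_a, raw_b = 1.0 / odds_a, 1.0 / odds_b
    total = raw_a + raw_b  # typically exceeds 1 because of the bookmaker's margin
    return raw_a / total, raw_b / total


def odds_baseline_accuracy(matches):
    """Fraction of matches won by the player the odds favour (ties go to player A)."""
    correct = 0
    for odds_a, odds_b, winner in matches:
        p_a, p_b = implied_probabilities(odds_a, odds_b)
        predicted = "A" if p_a >= p_b else "B"
        correct += predicted == winner
    return correct / len(matches)


# (decimal odds for player A, decimal odds for player B, actual winner)
toy_matches = [
    (1.50, 2.60, "A"),
    (1.20, 4.50, "A"),
    (2.10, 1.75, "A"),
    (1.90, 1.90, "B"),
    (3.00, 1.40, "B"),
]

print(odds_baseline_accuracy(toy_matches))
```

A model's accuracy on the same held-out matches can then be compared directly against this baseline figure.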
Taxonomy
Topics: Sports Analytics and Performance · Sports Performance and Training · Sports Dynamics and Biomechanics
