Combining Machine Learning and Human Experts to Predict Match Outcomes in Football: A Baseline Model
Ryan Beal, Stuart E. Middleton, Timothy J. Norman, Sarvapali D. Ramchurn

TL;DR
This paper introduces a benchmark dataset combining statistical match data and journalistic articles to predict football match outcomes, demonstrating improved accuracy over traditional methods.
Contribution
It provides a new dataset and baseline models that integrate machine learning with human expert insights for football outcome prediction.
Findings
Achieved 63.18% prediction accuracy
Boosted accuracy by 6.9% over traditional methods
Utilized both statistical data and journalistic articles
Abstract
In this paper, we present a new application-focused benchmark dataset and results from a set of baseline Natural Language Processing and Machine Learning models for predicting match outcomes in games of football (soccer). By doing so, we give a baseline for the prediction accuracy that can be achieved by exploiting both statistical match data and contextual articles from human sports journalists. Our dataset focuses on a representative time period covering 6 seasons of the English Premier League, and includes newspaper match previews from The Guardian. The models presented in this paper achieve an accuracy of 63.18%, a 6.9% boost over the traditional statistical methods.
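The abstract reports the 6.9% boost without saying whether it is an absolute gain (percentage points) or a relative one. As a rough aid to reading the headline numbers, the sketch below computes the baseline accuracy that each interpretation would imply. The function name `implied_baseline` is hypothetical, and neither derived figure is reported in the abstract; they are arithmetic consequences of the two stated numbers only.

```python
def implied_baseline(accuracy: float, boost: float) -> dict:
    """Return the baseline accuracy (in %) implied by each reading of the boost.

    accuracy: reported model accuracy in percent (63.18 in the abstract).
    boost:    reported improvement in percent (6.9 in the abstract).
    """
    return {
        # Reading 1: the boost is in percentage points (absolute difference).
        "absolute": accuracy - boost,
        # Reading 2: the boost is relative to the baseline accuracy.
        "relative": accuracy / (1 + boost / 100),
    }


# Figures taken from the abstract; the outputs are derived, not reported.
baselines = implied_baseline(63.18, 6.9)
for reading, value in baselines.items():
    print(f"{reading}: {value:.2f}%")
```

Under the absolute reading the traditional methods would sit near 56.28%, and under the relative reading near 59.10%. The full paper should be consulted for the baseline figure it actually reports.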
Taxonomy
Topics: Sports Analytics and Performance
