Clustering of football players based on performance data and aggregated clustering validity indexes
Serhat Akhanli, Christian Hennig

TL;DR
This paper clusters football players on performance data using a tailor-made dissimilarity measure and an aggregated validation index, yielding two clusterings: a partition into major player groups for team analysis, and a finer grouping of similar players.
Contribution
It introduces a tailored clustering approach with validation criteria and expert-informed weighting, specifically designed for football player performance data.
Findings
Two distinct clusterings: major player groups and similar-player profiles
Validated clustering methods using multiple criteria and expert input
Enhanced understanding of player groupings for team analysis and scouting
Abstract
We analyse football (soccer) player performance data with mixed-type variables from the 2014-15 season of eight major European leagues. We cluster these data based on a tailor-made dissimilarity measure. In order to decide between the many available clustering methods and to choose an appropriate number of clusters, we use the approach by Akhanli and Hennig (2020). This is based on several validation criteria that refer to different desirable characteristics of a clustering. These characteristics are chosen based on the aim of clustering, and this allows a suitable validation index to be defined as a weighted average of calibrated individual indexes measuring the desirable features. We derive two different clusterings. The first one is a partition of the data set into major groups of essentially different players, which can be used for the analysis of a team's composition. The second one…
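The idea of a validation index built as a weighted average of calibrated individual indexes can be illustrated with a minimal Python sketch. It rests on several assumptions that are not in the abstract: calibration is approximated here by z-scoring each index across the supplied candidate clusterings, higher values are assumed to be better for every index, and the index names, values, and weights are hypothetical placeholders rather than the authors' choices.

```python
from statistics import mean, pstdev


def calibrate(values):
    """Z-score one index across all candidate clusterings.

    Assumption: this simple standardisation stands in for the
    calibration step; the paper's exact procedure may differ.
    """
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        # An index that does not vary cannot discriminate between candidates.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def aggregate_index(index_values, weights):
    """Weighted average of calibrated indexes, one score per candidate.

    index_values: dict mapping index name -> list of values
                  (one value per candidate clustering, higher = better).
    weights:      dict mapping index name -> non-negative weight.
    """
    names = list(index_values)
    n_candidates = len(index_values[names[0]])
    calibrated = {name: calibrate(index_values[name]) for name in names}
    total_weight = sum(weights[name] for name in names)
    return [
        sum(weights[name] * calibrated[name][i] for name in names) / total_weight
        for i in range(n_candidates)
    ]


# Hypothetical index values for three candidate clusterings.
index_values = {
    "within_homogeneity": [0.2, 0.5, 0.8],
    "between_separation": [0.9, 0.6, 0.3],
}
# Hypothetical weights: homogeneity counts twice as much as separation.
weights = {"within_homogeneity": 2.0, "between_separation": 1.0}

scores = aggregate_index(index_values, weights)
best = max(range(len(scores)), key=scores.__getitem__)
print("aggregated scores:", scores)
print("best candidate:", best)
```

Calibrating before averaging puts indexes with different scales on a common footing, so the weights express the relative importance of each desirable characteristic rather than compensating for differences in units.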
Taxonomy
Topics: Sports Analytics and Performance · Sports Performance and Training
