New inequality indicators for team ranking in multi-stage female professional cyclist races
Marcel Ausloos

TL;DR
This paper introduces new inequality indicators for ranking female cycling teams in multi-stage races, providing a methodology and numerical illustrations to analyze team strategies and competitiveness.
Contribution
It develops novel inequality measures and a methodology for their construction, applied to hierarchical team rankings in major female cycling races.
Findings
New indicators like 'leadership gap' and 'competition temperature' reveal team strategy differences.
Numerical illustrations on 2023 races demonstrate the indicators' effectiveness.
Analysis highlights the 'crucial core' of most competitive teams.
| characteristic | Giro d'Italia Donne | Tour de France Femmes | Vuelta Femenina |
|---|---|---|---|
| stages | 8(*) | 8 | 7 |
| riders per team | 7 | 7 | 7 |
| teams (W + C) | 24 (=15+9) | 22 (=15+7) | 23 (=12+11) |
| distance (km) | 963.6 | 960.4 | 740.5 |
| riders at start | 167 | 154 | 160 |
| riders finishing | 133 | 123 | 127 |
| teams in final ranking | 23 | 22 | 22 |
| winning rider | van Vleuten | Vollering | van Vleuten |
| winning team | MOV | SDW | UAD |
| race dates | 06/30-07/09 | 07/23-07/30 | 05/01-05/07 |
| 2022 UCI rank | UCI code | Team (sponsors) | Giro 2023 | Tour 2023 | Vuelta 2023 |
|---|---|---|---|---|---|
| 1 | SDW | Team SD Worx | W | W | W |
| 2(∗) | TFS | Trek - Segafredo // Lidl - Trek | W | W | W |
| 3 | DSM | Team DSM - Firmenich | W | W | W |
| 4 | FST | FDJ - SUEZ | W | W | W |
| 5 | MOV | Movistar Team | W | W | W |
| 6 | CSR | Canyon-SRAM Racing | W | W | W |
| 7 | UAD | UAE Team ADQ | W | W | W |
| 9 | JAY | Team Jayco AlUla | W | W | W |
| 10 | JVW | Team Jumbo-Visma (TJV) | W | W | W |
| 11 | TIB | EF Education-TIBCO-SVB | W | W | W |
| 13 | LIV | Liv Racing TeqFind | W | W | W |
| 14 | WNT | CERATIZIT-WNT Pro Cycling | | C | |
| 15 | LPW | Lifeplus Wahoo | | C | |
| 19 | AGS | AG Insurance - Soudal Quick-Step | C | C | |
| 20 | HPU | Team Coop - Hitec Products | | C | C |
| 23 | AUB | St Michel - Mavic - Auber93 WE | | C | C |
| 24 | UXT | Uno-X Pro Cycling Team | W | W | |
| 25 | HPH | Human Powered Health | W | W | |
| 26 | BPK | BePink - GOLD | C | | C |
| 28 | COF | Cofidis Women Team | | C | |
| 30 | ARK | Arkéa Pro Cycling Team | | C | |
| 31 | MAT | Massi - Tactic Women Team | | | C |
| 33 | BDU | Bizkaia Durango | C | | C |
| 35 | TOP | Top Girls Fassa Bortolo | C | | |
| 36 | EIC | Eneicat - CMTeam - Seguros Deportivos | | | C |
| 39 | SWT | Sopela Women’s Team | | | C |
| 40 | VAI | Aromitalia - Basso Bikes - Vaiano | C | | |
| 42 | FBW | Farto-BTC Women’s Cycling Team | | | C |
| 51 | LKF | Laboral Kutxa Fundación Euskadi | | | C |
| 53 | SBT | Isolmant - Premac - Vittoria | C | | |
| 56 | STC | Soltec Team | | | C |
| 59 | BTW | Born To Win G20 Ambedo | C | | |
| 63 | CDR | Cantabria Deporte - Rio Miera | | | C |
| 64 | MDS | Team Mendelspeck | C | | |
| 112 | FED | Fenix-Deceuninck | W | W | |
| 145 | COG | Israel Premier Tech Roland | W | W | W |
| (∗∗) | GBJ | GB Junior Team Piemonte Pedale Castanese A.S.D. | C | | |
| rank (Giro 2023) | team | UCI team time | team | adjusted team time | team | leadership gap |
|---|---|---|---|---|---|---|
| 1 | MOV | 73:54:51 | MOV | 73:59:03 | DFP | 0:00:33 |
| 2 | FST | 73:55:37 | FST | 74:02:34 | SDW | 0:03:13 |
| 3 | LTK | 74:05:36 | DFP | 74:16:58 | VAI | 0:03:27 |
| 4 | DFP | 74:16:25 | SDW | 74:29:31 | BPK | 0:04:01 |
| 5 | CSR | 74:20:26 | UAD | 74:35:06 | MOV | 0:04:12 |
| 6 | UAD | 74:22:33 | LTK | 74:35:41 | TJV | 0:05:45 |
| 7 | SDW | 74:26:18 | CSR | 74:43:13 | FST | 0:06:57 |
| 8 | JAY | 74:32:19 | FED | 74:56:18 | COG | 0:07:46 |
| 9 | FED | 74:39:03 | TIB | 75:02:57 | TOP | 0:08:20 |
| 10 | TIB | 74:43:13 | JAY | 75:05:57 | HPH | 0:08:46 |
| 11 | TJV | 75:04:06 | TJV | 75:09:51 | GBJ | 0:09:06 |
| 12 | HPH | 75:08:40 | HPH | 75:17:26 | UAD | 0:12:33 |
| 13 | AGS | 75:22:19 | COG | 75:38:24 | SBT | 0:14:35 |
| 14 | LIV | 75:23:08 | AGS | 75:39:49 | FED | 0:17:15 |
| 15 | COG | 75:30:38 | LIV | 75:45:24 | AGS | 0:17:30 |
| 16 | UXT | 75:34:50 | UXT | 75:57:22 | TIB | 0:19:44 |
| 17 | TOP | 76:54:08 | TOP | 77:02:28 | MDS | 0:21:56 |
| 18 | BPK | 77:01:00 | BPK | 77:05:01 | LIV | 0:22:16 |
| 19 | MDS | 77:02:05 | MDS | 77:24:01 | UXT | 0:22:32 |
| 20 | SBT | 77:49:13 | SBT | 78:03:48 | CSR | 0:22:47 |
| 21 | VAI | 78:20:07 | VAI | 78:23:34 | LTK | 0:30:05 |
| 22 | BDU | 78:22:49 | GBJ | 79:18:17 | JAY | 0:33:38 |
| 23 | GBJ | 79:09:11 | BDU | 79:30:07 | BDU | 1:07:18 |
| 24 | BTW | … | BTW | … | BTW | … |
| rank (Tour 2023) | team | UCI team time | team | adjusted team time | team | leadership gap |
|---|---|---|---|---|---|---|
| 1 | SDW | 76:17:38 | SDW | 76:26:34 | ARK | 0:02:12 |
| 2 | CSR | 76:29:43 | MOV | 76:41:25 | UAD | 0:04:15 |
| 3 | MOV | 76:35:41 | CSR | 76:46:20 | JAY | 0:05:34 |
| 4 | FST | 76:37:18 | UAD | 76:53:19 | COF | 0:05:35 |
| 5 | UAD | 76:49:04 | FST | 76:54:55 | COG | 0:05:37 |
| 6 | AGS | 76:53:31 | AGS | 77:01:21 | MOV | 0:05:44 |
| 7 | TJV | 77:02:04 | TJV | 77:18:35 | HPH | 0:05:53 |
| 8 | LTK | 77:05:09 | COG | 77:20:44 | AGS | 0:07:50 |
| 9 | DFP | 77:07:58 | DFP | 77:24:11 | WNT | 0:08:26 |
| 10 | COG | 77:15:07 | LTK | 77:33:18 | SDW | 0:08:56 |
| 11 | FED | 77:27:11 | WNT | 77:37:40 | AUB | 0:13:51 |
| 12 | LIV | 77:27:47 | AUB | 77:55:05 | DFP | 0:16:13 |
| 13 | WNT | 77:29:14 | FED | 77:56:01 | TJV | 0:16:31 |
| 14 | TIB | 77:39:34 | COF | 77:57:04 | CSR | 0:16:37 |
| 15 | AUB | 77:41:14 | HPH | 78:00:33 | FST | 0:17:37 |
| 16 | COF | 77:51:29 | JAY | 78:06:15 | LPW | 0:20:49 |
| 17 | HPH | 77:54:40 | ARK | 78:17:01 | LTK | 0:28:09 |
| 18 | LPW | 77:58:42 | LIV | 78:17:50 | FED | 0:28:50 |
| 19 | JAY | 78:00:41 | LPW | 78:19:31 | HPU | 0:38:39 |
| 20 | UXT | 78:08:20 | TIB | 78:32:07 | LIV | 0:50:03 |
| 21 | ARK | 78:14:49 | UXT | 78:58:52 | UXT | 0:50:32 |
| 22 | HPU | 79:25:34 | HPU | 80:04:13 | TIB | 0:52:33 |
| rank (Vuelta 2023) | team | UCI team time | team | adjusted team time | team | leadership gap |
|---|---|---|---|---|---|---|
| 1 | UAD | 56:39:07 | FST | 57:27:16 | TJV | 0:36:06 |
| 2 | FST | 56:45:07 | UAD | 57:29:18 | SDW | 0:36:39 |
| 3 | CSR | 56:46:21 | SDW | 57:33:51 | TFS | 0:37:11 |
| 4 | SDW | 56:57:12 | CSR | 57:33:55 | AUB | 0:38:04 |
| 5 | MOV | 57:04:05 | TJV | 57:45:23 | TIB | 0:38:21 |
| 6 | TJV | 57:09:17 | MOV | 57:46:37 | FST | 0:42:09 |
| 7 | DSM | 57:10:52 | TFS | 57:53:19 | COG | 0:42:26 |
| 8 | TFS | 57:16:08 | DSM | 57:53:29 | LIV | 0:42:31 |
| 9 | JAY | 57:32:00 | JAY | 58:15:12 | MOV | 0:42:32 |
| 10 | COG | 57:36:22 | COG | 58:18:48 | DSM | 0:42:37 |
| 11 | LIV | 57:43:19 | LIV | 58:25:50 | JAY | 0:43:12 |
| 12 | LKF | 57:50:32 | TIB | 58:33:59 | EIC | 0:46:11 |
| 13 | TIB | 57:55:38 | AUB | 58:53:35 | CSR | 0:47:34 |
| 14 | EIC | 58:07:34 | EIC | 58:53:45 | UAD | 0:50:11 |
| 15 | AUB | 58:15:31 | LKF | 59:06:29 | CDR | 0:51:51 |
| 16 | BDU | 58:31:27 | BDU | 59:40:22 | FBW | 0:52:25 |
| 17 | MAT | 59:05:37 | MAT | 60:00:16 | MAT | 0:54:39 |
| 18 | BPK | 59:15:03 | BPK | 60:17:18 | STC | 1:01:27 |
| 19 | HPU | 59:24:51 | HPU | 60:42:30 | BPK | 1:02:15 |
| 20 | FBW | 60:10:45 | FBW | 61:03:10 | BDU | 1:08:55 |
| 21 | CDR | 60:27:06 | CDR | 61:18:57 | LKF | 1:15:57 |
| 22 | STC | 61:17:50 | STC | 62:19:17 | HPU | 1:17:39 |
| 23 | SWT | … | SWT | … | SWT | … |
| | Giro: UCI time | Giro: adjusted time | Giro: gap | Tour: UCI time | Tour: adjusted time | Tour: gap | Vuelta: UCI time | Vuelta: adjusted time | Vuelta: gap |
|---|---|---|---|---|---|---|---|---|---|
| Min. | 73:54:51 | 73:59:03 | 0:00:33 | 76:17:38 | 76:26:34 | 0:02:12 | 56:39:07 | 57:27:16 | 0:36:06 |
| Max. | 79:09:11 | 79:30:07 | 1:07:18 | 79:25:34 | 80:04:13 | 0:52:33 | 61:17:50 | 62:19:17 | 1:17:39 |
| Mean | 75:39:04 | 75:54:54 | 0:15:50 | 77:26:01 | 77:44:41 | 0:18:39 | 58:08:16 | 58:57:51 | 0:49:35 |
| Geom. mean | 75:38:07 | 75:53:53 | 0:10:37 | 77:25:50 | 77:44:25 | 0:12:58 | 58:07:26 | 58:56:53 | 0:48:15 |
| Med. | 75:08:40 | 75:17:26 | 0:12:33 | 77:27:29 | 77:46:23 | 0:15:02 | 57:46:56 | 58:29:55 | 0:44:42 |
| Std. dev. | 1:35:42 | 1:39:20 | 0:14:27 | 0:43:14 | 0:50:44 | 0:16:04 | 1:18:06 | 1:25:21 | 0:12:29 |
| Skewn. | 0.84232 | 0.88415 | 2.03400 | 0.70337 | 0.76641 | 1.06135 | 0.93943 | 0.84955 | 1.00163 |
| Kurt. | -0.59051 | -0.38866 | 5.00210 | 0.77460 | 0.77392 | -0.16262 | -0.06882 | -0.37660 | -0.07262 |

| | Giro: UCI time | Giro: adjusted time | Giro: gap | Tour: UCI time | Tour: adjusted time | Tour: gap | Vuelta: UCI time | Vuelta: adjusted time | Vuelta: gap |
|---|---|---|---|---|---|---|---|---|---|
| Temperature index | 0.31895 | 0.31895 | 0.35652 | 0.32352 | 0.32352 | 0.36126 | 0.32354 | 0.32354 | 0.32652 |
| Entropy | 3.13528 | 3.13527 | 2.80487 | 3.09100 | 3.09099 | 2.76806 | 3.09080 | 3.09077 | 3.06258 |
| Theil index | 1.05e-04 | 1.13e-04 | 0.16414 | 2.06e-05 | 2.81e-05 | 0.15940 | 1.18e-04 | 1.38e-04 | 0.01386 |
| Coefficient of Variation | 0.02062 | 0.02133 | 0.89212 | 0.00909 | 0.01062 | 0.84168 | 0.02187 | 0.02357 | 0.24586 |
| Herfindahl-Hirschman index | 0.04350 | 0.04350 | 0.07808 | 0.04546 | 0.04546 | 0.07766 | 0.04548 | 0.04548 | 0.04820 |
| Normalized Herfindahl-Hirschman index | 1.93e-05 | 2.07e-05 | 0.03618 | 3.94e-06 | 5.38e-06 | 0.03373 | 2.28e-05 | 2.65e-05 | 0.00288 |
| Gini coefficient | 0.01120 | 0.01160 | 0.43649 | 0.00500 | 0.00582 | 0.44368 | 0.01192 | 0.01291 | 0.13211 |
| Atkinson index | 2.11e-04 | 2.26e-04 | 0.33062 | 4.12e-05 | 5.63e-05 | 0.32298 | 2.38e-04 | 2.76e-04 | 0.02846 |
| Pietra-Hoover index | 0.00867 | 0.00883 | 0.32024 | 0.00359 | 0.00422 | 0.33864 | 0.00889 | 0.00984 | 0.09997 |
| Rosenbluth coefficient | 0.04397 | 0.04399 | 0.07716 | 0.04568 | 0.04572 | 0.08171 | 0.04600 | 0.04605 | 0.05237 |
| . | 11 | 231 | 0.91304 | 11 | 4.0196 | 1.7011 | |
| 118 | 17 | 0.067194 | 118 | 36.1731 | 12.6254 | ||
| . | 107 | 39 | 0.15415 | 107 | 32.6804 | 11.2680 | |
| . | 22 | 187 | 0.80952 | 22 | 6.5734 | 2.2169 | |
| 110 | 11 | 0.047619 | 110 | 36.1289 | 13.3487 | ||
| . | 88 | 60 | 0.23810 | 88 | 30.3798 | 11.7866 | |
| . | 8 | 215 | 0.93074 | 8 | 3.5031 | 1.9793 | |
| 74 | 83 | 0.35931 | 74 | 25.6371 | 10.2048 | ||
| . | 66 | 99 | 0.42857 | 66 | 22.8145 | 9.1486 |
**Marcel Ausloos 1,2,3,4
1 School of Business, University of Leicester, Brookfield,
Leicester, LE2 1RQ, UK
2 Group of Researchers Applying Physics in Economy and Sociology
(GRAPES), Beauvallon, rue de la Belle Jardinière, 483/0021
Sart Tilman, Angleur, B-4031, Liège, Belgium
e-mail: [email protected]
3 Department of Statistics and Econometrics,
The Bucharest University of Economic Studies,
Calea Dorobantilor 15-17, 010552 Bucharest, Romania
4 Department of Statistics, Predictions and Mathematics,
Universitatea Babeş-Bolyai,
Str. Mihail Kogălniceanu 1, 400084, Cluj-Napoca, Romania
**
Abstract
Cycling competition is highly interesting since the team ranking is based on the best performance of some subset of team members. The paper develops new inequality indicators, a methodology to construct them, and numerical illustrations providing operational arguments in their favor. The numerical illustrations deal with hierarchical ranking indicators of (female) cyclist teams competing in multi-stage races. For the illustration, the 2023 editions of the most famous long races for women are considered: 34th Giro d’Italia Donne, 2nd Tour de France Femmes, 9th Vuelta Femenina.
Several classical ranking indicators are recalled and adapted to the study cases. The most usual indicator is based on the riders' arrival times for the various stages, i.e., according to Union Cycliste Internationale (UCI) standard rules. One also uses another indicator, which requires that the riders finish the race, whence each stage, in order to define the best team of the race. Another contribution of the paper derives from specific developments of these indicators, thereby leading to new measures: the “leadership gap”, based on these two indicators, and the “competition temperature”, based on entropy. It is argued that the numerical values point to differences in team strategy based on rider skill levels. The ranking of contributions to the indicators allows one to observe the “crucial core” made of the most competitive teams.
Keywords:
dynamics of social systems; entropy; hierarchy selection; inequality indicators; multi-stage cycling races; team ranking;
1 Introduction
Rules have to be devised for providing a realistic hierarchy of choices.
Yet, the ranking methodology can lead to much debate, in social choice considerations among many others, as in the case of tournament ranking methods. In this respect, sport activities seem to provide rather objective and quantitatively reliable data for academic studies.
Among these, it appears that cyclist races contain much interesting data. Indeed, one can focus on the role of individuals within a team, since team hierarchy is based on the best performance of a subset of members of the team. Within this framework, incentives must be provided to teams and team members in order to obtain an interesting race competition. Thus, relevant hierarchy indices are needed, and are somewhat tied to money awards.
In the following, one considers the three most famous female races within the Union Cycliste Internationale (UCI) classifications: Giro d’Italia Donne, Tour de France Femmes, Vuelta Femenina.
According to UCI rules, the hierarchy of the teams at the end of a multi-stage (professional) cyclist race depends on the cumulative time of the team’s 3 fastest riders for every stage, not taking into account time bonuses or penalties. Bonuses and penalties are relevant for the rider standing, but are not taken into account for the team rank. Moreover, for the final team ranking at the end of the race, the UCI team hierarchy does not consider whether such riders, relevant for some stages, finish the whole race. This highly debatable measure has been discussed elsewhere as paradoxical, and proved to be highly biased.
Thus, one may introduce an “adjusted team final time” measure, based on the (3 fastest) riders of a team who have finished the whole race. In so doing, one avoids a possible Cipollini effect, which occurs when riders are specifically selected, as sprinters, for the first few, usually easy, stages but are withdrawn thereafter, yet globally contribute, even though absent, to the overall team time classification.
Going beyond the above consideration, one may derive metrics that aim at measuring some team skill and also at quantifying the global team strategy for a given race. To do so, one proposes two new measures or indicators: (i) the “leadership gap index”, (ii) the “race temperature index”. The ranking of teams according to such indicators allows one to observe the “crucial core” made of the most competitive teams.
In brief, these two newly defined metrics complement the entropy approach and hopefully extend previous works toward team management and coaching applications.
For completeness, let it be observed that this paper enters the framework of studies on cycling published in the International Journal of Sports Science & Coaching and other journals. Notice that most works pertain to the (physiological) characteristics of the cyclists. Closer to the present aim, O’Grady et al. discuss, after interviews, tactical strategies that professional riders and coaches prepare at training time for application in races.
In Section 2, one poses the Research Questions and mentions the Data sources. One displays the fundamental characteristics of the races.
In Section 3, one introduces the methodology, including the formulae for the two team time measures. Next, one explains that (i) the “leadership gap index”, in Section 3.1, is based on these two measures; (ii) the “race temperature index” is defined through Shannon-Boltzmann-Gibbs entropy concepts, in Section 3.2. It is argued why these indices are so called.
In Section 4, other hierarchy measuring indices are considered for ready comparison, i.e., in order to outline some qualitative advantages and disadvantages of the newly proposed indicators. There exist fundamentally different approaches in ranking methodology. It is pertinent to emphasize that changing the ranking rules, in a multi-stage race, may change some tournament metrics; see, for example, the scoring and ranking simulations by Csató.
Here, two inequality indicators can be directly derived from the distribution characteristics in order to evaluate the dispersion of team “values”: the Atkinson index, and the Coefficient of Variation. Three classical indicators of dispersion can be next considered: the Herfindahl-Hirschman index, the Gini coefficient, and the Theil index. These indicators show how dispersed the final times are, but are calculated without taking into account the ranking of teams. One may delve more into the hierarchy problem if one ranks the components. Moreover, one can calculate other ad hoc indicators: the Pietra-Hoover index, and the Rosenbluth coefficient.
Section 5 contains numerical results and some analysis. In Section 6, one looks deeper into team hierarchy, comparing teams in the various races. Conclusions follow in Section 7, together with suggestions for further research due to obvious limitations of the present study.
2 Research Questions
Due to the considerations outlined in the Introduction, i.e., the unjustified and biased UCI constraints on the usual team value measures, one can select the following research questions as a guiding thread of the paper:
- can one provide indicators with less biased constraints on team ranking?
- are there strategic or coaching features which arise when studying and measuring team “competition” hierarchy in cyclist races?
In order to find proper answers, geared toward various disciplines but based on case studies, the following top multi-stage races with complex data are used:
- 34th Giro d’Italia Donne;
- 2nd Tour de France Femmes;
- 9th Vuelta Femenina.
For the present exposé only a single year is examined: 2023.
For simplicity, the races will be called , , . A few fundamental characteristics of the 3 races are found in Table 1: dates of races, length, number of stages, number of riders and of teams, etc. The list of competing teams and their UCI codes is given in Table 2, which also indicates the team level according to UCI for those participating in a specific race: W refers to UCI Women’s World Teams; C to UCI Women’s Continental Teams.
One can freely obtain relevant data from the organizers’ websites. However, the data are not all provided in a consistent way. Therefore, it is best to rely on professional websites, e.g., . Nevertheless, data cross-checking must be systematically done; one chosen method has been to use pages. Disagreements are still found; they have been manually resolved.
3 Methodology
One should recall that, in a multi-stage cyclist race, the UCI rules imply that the winning team is determined as the team having the lowest sum of the cumulative times of its 3 fastest riders for each stage, i.e., the lowest
[TABLE]
when the team (finishing) time for stage is
[TABLE]
where is the finishing time of one of the 3 fastest riders () of team for that stage.
It is re-emphasized that the fact that such riders do not necessarily finish the whole race appears to be irrelevant for UCI. But one may rightly wonder whether the sum in Eq.(1) points to the “best team”. It seems that one should consider the adjusted team final time such that
[TABLE]
where refers to the 3 fastest riders having completed all stages for team . Let it be emphasized that these 3 “” riders might be quite different from the 3 “” riders having contributed to any , whence to .
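For orientation, the three definitions described in words above can be written compactly as follows. This is only a sketch in assumed notation, since the original symbols are not recoverable here: $L$ is the number of stages, $\#$ a team label, $t_{i,\#}^{(s)}$ the time of rider $i$ of team $\#$ on stage $s$, $T_{\#}$ the UCI team final time and $A_{\#}$ the adjusted team final time. In the second line $i$ runs over the 3 fastest riders of the team on stage $s$; in the third line $j$ runs over the 3 fastest riders of the team among those who completed all $L$ stages.

```latex
\begin{align}
T_{\#} &= \sum_{s=1}^{L} t_{\#}^{(s)}, \\
t_{\#}^{(s)} &= \sum_{i=1}^{3} t_{i,\#}^{(s)}, \\
A_{\#} &= \sum_{j=1}^{3} \sum_{s=1}^{L} t_{j,\#}^{(s)} .
\end{align}
```

With this reading, the riders entering $A_{\#}$ are fixed for the whole race, whereas the riders entering $t_{\#}^{(s)}$ may change from stage to stage.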
One complexity has to be emphasized: according to UCI rules, the final time of a rider, at the end of a stage, whence at the end of the race, takes into account bonuses (and penalties); thus the true finishing time of a rider is apparently equal to the sum of “the reported final time + bonus - penalties”. However, UCI rules disregard such extra time measures when calculating the team time on a stage, whence for the whole race. Thus, the same “restriction” has been used for calculating the adjusted team final time.
Therefore, the methodological path goes as follows:
- get each team , according to the organizers’ published data;
- rank riders in each team according to their true finishing race time, i.e., excluding bonuses and penalties (if they exist);
- select the 3 fastest riders overall in each team, for each daily stage, and add their cumulative times to get .
One obtains the values and hierarchy displayed in Tables 3 - 5, in increasing time order. The statistical characteristics of the relevant finishing times distributions are reported in Table 6.
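The two team time measures discussed in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used for the tables: the input layout (one dictionary per stage, mapping a rider to a `(team, seconds)` pair) and the function name `team_times` are hypothetical, times are assumed to exclude bonuses and penalties already, and a team with fewer than 3 riders on a stage is simply summed over the riders it has.

```python
from collections import defaultdict

def team_times(stage_results, n_best=3):
    """Return (uci_time, adjusted_time), two dicts keyed by team, in seconds.

    stage_results: list with one dict per stage, {rider: (team, seconds)}.
    A rider missing from a stage dict is treated as not having finished it.
    """
    n_stages = len(stage_results)
    uci = defaultdict(int)          # sum over stages of the n_best fastest times
    rider_total = defaultdict(int)  # cumulative time per rider
    rider_stages = defaultdict(int) # number of stages completed per rider
    team_of = {}

    for stage in stage_results:
        per_team = defaultdict(list)
        for rider, (team, secs) in stage.items():
            per_team[team].append(secs)
            rider_total[rider] += secs
            rider_stages[rider] += 1
            team_of[rider] = team
        for team, times in per_team.items():
            uci[team] += sum(sorted(times)[:n_best])

    # Adjusted time: only riders who completed every stage are eligible.
    finishers = defaultdict(list)
    for rider, count in rider_stages.items():
        if count == n_stages:
            finishers[team_of[rider]].append(rider_total[rider])
    adjusted = {
        team: sum(sorted(totals)[:n_best])
        for team, totals in finishers.items()
        if len(totals) >= n_best  # teams with too few finishers get no adjusted time
    }
    return dict(uci), adjusted
```

In a toy race where a rider is the fastest of the team on the first stage and then abandons, the UCI-style time still counts that stage result while the adjusted time does not, which is the effect the adjusted measure is meant to remove.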
Together with Table 1, Table 6 allows one to compare race difficulties a posteriori. It is easily observed that the distributions of the team times are similar for the Giro d’Italia Donne and the Tour de France Femmes, both races taking much more time than the Vuelta Femenina, since indeed they are longer. In the 3 cases, the skewness is positive, indicating a long tail in the final time distributions for the slowest teams. The negative kurtosis for the Giro and the Vuelta indicates a flatter distribution, whence a race where teams find a more balanced competition, in contrast to the Tour de France, which presents a distribution peaked at the mean, itself close to the median. The same deduction holds when observing the dispersion of the final times, much smaller in the Tour de France, indicating a fiercer competition between the top teams.
3.1 Leadership Gap index
Since the measures and indicate a different team hierarchy, one can consider that their relative value:
[TABLE]
which measures a behavioral difference between teams’ and/or riders’ performances, which might be due to team members’ skills or to coaching strategy.
The values of the leadership gap are given in Tables 3 - 5. The smallest values should correspond to the cases in which the 3 fastest riders in each stage remain so throughout the whole competition, and finish it. A large value, in contrast, indicates a team emphasis on a distributive role according to riders’ skills.
Indeed, the indicator reaches a large value if the riders are not much concerned by their final rank. In this case, the “team leaders” do not seem to be “pre-defined”. Thus, the index appears to be a measure of a specific rider leadership definition in a team; in other words, its value measures a “gap” between team strategies, depending on riders’ skills and coaches’ mandates.
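As a quick check on the tabulated gap values, the sketch below recomputes the gap for two rows of the first ranking table. It assumes that the tabulated gap is the plain difference between the adjusted time and the UCI time, which is what the listed numbers are consistent with; whether the definition in the text includes a further normalization cannot be told from the values alone. The helper names are hypothetical.

```python
def to_seconds(hms):
    """Convert an 'h:mm:ss' string (hours may exceed 24) to seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return 3600 * h + 60 * m + s

def to_hms(seconds):
    """Convert a non-negative number of seconds to an 'h:mm:ss' string."""
    h, rem = divmod(seconds, 3600)
    m, s = divmod(rem, 60)
    return f"{h}:{m:02d}:{s:02d}"

def leadership_gap(uci_time, adjusted_time):
    """Assumed definition: adjusted team time minus UCI team time."""
    return to_hms(to_seconds(adjusted_time) - to_seconds(uci_time))

# MOV row: UCI time 73:54:51, adjusted time 73:59:03
print(leadership_gap("73:54:51", "73:59:03"))  # 0:04:12
# DFP row: UCI time 74:16:25, adjusted time 74:16:58
print(leadership_gap("74:16:25", "74:16:58"))  # 0:00:33
```

Both results match the gap column for these teams, i.e., 0:04:12 for MOV and 0:00:33 for DFP.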
3.2 Stage and Race Temperature index
Shannon and Boltzmann-Gibbs entropies are analogous measures of disorder in information theory and thermodynamics, respectively. The maximum entropy value corresponds to a state of maximum uncertainty, i.e., when all outcomes are equally likely, pointing to a lack of structure or even of predictability since no outcome is favored. Thus, the concept seems of interest for measuring some operational effect in sport results.
Let be the relative time measure ( “contribution”) of the (3 fastest riders of a) team () to the total (cumulative) time that was needed by the 3 fastest riders of each team among all () teams in competition in order to finish the stage
[TABLE]
is defined in terms of , the finishing time of one of the 3 fastest riders () of team (or , in terms of UCI codes) for that stage; see Eq. (2). This measure can be taken per definition like a probability of a team best finishing time among others. Thus, can serve as a characteristics how rare the occurrence of such an outcome is. The stochastic Shannon entropy reads
[TABLE]
The average of such a over the total probability distribution leads to the Shannon information entropy;
[TABLE]
The whole race entropy derives from summing over all the teams entropy: .
Thereafter, reconnecting the Shannon information entropy to the thermodynamic Boltzmann-Gibbs entropy, one can define a team dependent (“generalized”) temperature during the stage as
[TABLE]
Mutatis mutandis, this is analogous to the “temperature of financial markets”. In fact, one may propose a (-th) “stage temperature index” as
[TABLE]
the smaller it is, the cooler the competition appears to be during the stage. Indeed, recall that the Shannon entropy of a uniform distribution, i.e., if all team contributions are equal, thus in the absence of disorder, is the maximum entropy value which can occur. Randomness or disorder in the contributions thereby corresponds to a high “(behavioral) temperature”, here seen as an intense competition.
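The chain from team times to entropy to a temperature-like index can be sketched as below. The share and entropy formulas follow the verbal description above. Taking the index as the inverse of the entropy is an assumption, suggested by tabulated values such as 0.31895 and 3.13528, whose product is 1; the function names are hypothetical.

```python
import math

def shares(times):
    """Relative contribution of each team to the total time of all teams."""
    total = sum(times)
    return [t / total for t in times]

def shannon_entropy(p):
    """Shannon entropy -sum p ln p (natural logarithm) of a share distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def temperature_index(p):
    """Assumed temperature-like index: the inverse of the Shannon entropy."""
    return 1.0 / shannon_entropy(p)

# 23 teams with identical times: the entropy reaches its maximum, ln(23)
uniform = shares([1.0] * 23)
print(shannon_entropy(uniform), math.log(23))
print(temperature_index(uniform))
```

For 23 equal contributions the entropy is ln 23, about 3.1355, and the index about 0.3189; any inequality among the contributions lowers the entropy and raises the index, in line with the interpretation of a "hotter" competition.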
The “team temperature” is forecasted to be higher if the riders have much strategic freedom. It is readily expected that such a temperature is lower if leaders are well defined. In the present case, this occurs, as easily understood, if is small.
The overall race temperature is of course
[TABLE]
where the latter has been calculated from the time of riders irrespective of the fact that they might not have finished the whole race; thus, more exactly, one should have written the rather as . Recall that instead of in Eq.(5), one can consider that the pertinent time is that of riders finishing the whole race. Following the above path, one would obtain a final race temperature . Moreover, another temperature can be derived from data, leading to some . Due to the nonlinear data transformations, , - in contrast to the Entropy which has an additivity property.
Again, one can justify the semantic validity for calling a relative temperature index. Indeed, appears to be a measure of the distribution of riders kinetic energy at the end of a race (or stage).
4 Other Indicators
4.1 Atkinson index
The Atkinson index can be used for evaluating the dispersion of team strengths in a race. By definition,
[TABLE]
where the geometric mean and the arithmetic mean of the distribution are those reported in Table 6. The Atkinson index has previously been used in sport in order to measure competitive balance, with an application to English Premier League football. One can obviously apply the notion to the present cases. It could also be considered for daily stages rather than the whole race, but such an application is left for further work.
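A minimal sketch of the index, assuming the common form "one minus the ratio of the geometric mean to the arithmetic mean" (the Atkinson index with inequality aversion parameter equal to 1), which is the form suggested by the two means mentioned above. The function name is hypothetical.

```python
import math

def atkinson_index(values):
    """Atkinson index in the form 1 - (geometric mean) / (arithmetic mean).

    All values must be strictly positive.
    """
    n = len(values)
    arithmetic = sum(values) / n
    geometric = math.exp(sum(math.log(v) for v in values) / n)
    return 1.0 - geometric / arithmetic

# Two values 1 and 4: geometric mean 2, arithmetic mean 2.5
print(atkinson_index([1.0, 4.0]))
```

Applied to the rounded means listed for the first race (75:39:04 and 75:38:07, i.e., 272344 s and 272287 s), the same ratio gives about 2.1e-4, close to the tabulated 2.11e-04; the small difference is compatible with the means being rounded to the second.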
4.2 Coefficient of Variation
Among the indicators using statistical characteristics of distributions, the Coefficient of Variation measures the relative dispersion of the data, i.e., the dispersion around the mean of the distribution relative to that mean; thus, expressed as a percentage, it hints at inequalities. By definition, one has
[TABLE]
easily obtained from the distributions characteristics.
4.3 Herfindahl-Hirschman index
One may also recall the Herfindahl-Hirschman index, serving to measure the “amount of competition” between economic entities, or, for our examples, between teams.
The Herfindahl index, also known as the Herfindahl-Hirschman index, is a measure of “concentration in a market”. Formally, in obvious notations, it reads
[TABLE]
where is some economic measure, like a company size, or its share (thus, a concentration) in a market. Thus, a low index indicates a highly competitive market between firms. From a portfolio point of view, a low index implies a very diversified portfolio; a high concentration demands ; a low concentration ; the index ranges between 1/N, for N entities, and 1.
Adapted to the case of sport team ranking, can be considered as the finishing time () of a team (), i.e., leading to , as defined in Eq. (5), and where the relevant team times are selected depending on the chosen or scheme, or even .
As an extreme example, which sometimes occurs, if the 3 riders of each team arrive together, whence all teams have the same finishing time for a stage, all terms in the Eq.(13) sum are equal, whence the index takes its minimum value, 1/N for N teams, pointing to uniformity or, in other words, to a rather weakly competitive race. Conversely, an increase in the index represents a decrease in competitive balance.
One sometimes says that the “number of effectively important competitors” is the inverse of the Herfindahl index.
A normalized is sometimes used in order to attempt some universal definition:
[TABLE]
with the appropriate ; it ranges between 0 and 1.
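The concentration index, its normalized version, and the Coefficient of Variation of the previous subsection can be sketched together, since they are linked: with the population standard deviation, the sum of squared shares equals (1 + CV²)/N. The sketch assumes the standard textbook definitions (sum of squared shares, and rescaling from [1/N, 1] to [0, 1]); the function names are hypothetical.

```python
import math

def herfindahl(values):
    """Sum of squared shares; ranges from 1/N (all equal) to 1."""
    total = sum(values)
    return sum((v / total) ** 2 for v in values)

def normalized_herfindahl(values):
    """Rescale the index from [1/N, 1] to [0, 1]."""
    n = len(values)
    return (herfindahl(values) - 1.0 / n) / (1.0 - 1.0 / n)

def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return math.sqrt(variance) / mean

data = [1.0, 2.0, 3.0, 4.0]
h = herfindahl(data)
cv = coefficient_of_variation(data)
print(h, (1.0 + cv ** 2) / len(data))  # the two numbers agree
print(1.0 / h)                         # "number of effectively important competitors"
```

Because team final times differ by only a few percent, CV² is tiny and the index stays very close to 1/N, which is why the normalized version is the more readable of the two for such data.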
4.4 Gini Coefficient
The most popular way for quantifying inequality levels, in socio-economic systems, is through the Gini coefficient (). It reads
[TABLE]
where the -th item has a measure , and is the average value of this quantity over the whole set of elements. In the present case, the measure can be the resulting time of a team due to 3 riders, as above, within either of the two time schemes. The Gini coefficient should be equal to 0 if all teams are equivalent but close to 1 if one team is much above the others, or, in socio-economic terms, “monopolizes the whole of the available resources”. One should expect a Gini coefficient close to 0 if the competition has no winner, in other words if teams have equivalent final “values”.
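A minimal sketch of the coefficient, assuming the usual mean absolute difference form, i.e., the sum of all pairwise absolute differences divided by 2 N² times the mean; this matches the verbal description above (a measure per item and its average over the set) but the exact form used in the text is an assumption. The function name is hypothetical.

```python
def gini(values):
    """Gini coefficient from the mean absolute difference between all pairs."""
    n = len(values)
    mean = sum(values) / n
    pair_sum = sum(abs(a - b) for a in values for b in values)
    return pair_sum / (2.0 * n * n * mean)

print(gini([3.0, 3.0, 3.0, 3.0]))  # 0.0 when all items are equivalent
print(gini([0.0, 0.0, 0.0, 1.0]))  # 0.75, i.e., 1 - 1/N when one item holds everything
```

With a finite number N of items the maximum is 1 - 1/N, so the value 1 is only approached for large N.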
4.5 Theil index
For completeness, one can define the “final” Theil index. One has
[TABLE]
summing over the different (finishing) teams in the race and where is the mean value, of any variable, which here can be any . This transformation induces negative and positive values of the (log-transformed) data, depending on whether the ratio $x_{k}/\langle x_{k}\rangle$ is smaller or larger than 1. Whence, the index can be very small.
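A minimal sketch, assuming the standard Theil T form, i.e., the average over items of (x/mean) ln(x/mean); the per-item terms are returned as well, since they are negative for items below the mean and positive for items above it. The function names are hypothetical.

```python
import math

def theil_terms(values):
    """Per-item terms (x/mean) * ln(x/mean) / N; they sum to the Theil index."""
    n = len(values)
    mean = sum(values) / n
    return [(v / mean) * math.log(v / mean) / n for v in values]

def theil_index(values):
    """Theil T index: zero when all values are equal, positive otherwise."""
    return sum(theil_terms(values))

print(theil_terms([1.0, 3.0]))  # one negative and one positive term
print(theil_index([1.0, 3.0]))
```

When the values are all within a few percent of the mean, the negative and positive terms nearly cancel, which explains why the index can be very small for team final times.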
4.6 Pietra-Hoover index
It seems of interest, for emphasizing the structure, like the maximum position and the corresponding percentage of the relevant population, to display the data as the difference between the Lorenz curve () and the line of perfect equality. One has
[TABLE]
with . In fact, this is the Pietra-Hoover (inequality) index.
[TABLE]
It indicates how the variable values should be (re)distributed in order for them to create a perfect equality in times. High values of the index obviously represent a high inequality level since a greater redistribution of values is required in order to achieve equality; vice-versa, lower values of the index represent a lower inequality level.
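The link between the Lorenz curve and the index can be sketched as follows, assuming the standard definitions: the Lorenz curve is built from values sorted in increasing order, and the Pietra-Hoover index is half the sum of absolute deviations of the shares from 1/N. With these definitions the index equals the largest vertical distance between the line of perfect equality and the Lorenz curve. The function names are hypothetical.

```python
def lorenz_distances(values):
    """Distance between the equality line and the Lorenz curve at each rank.

    Values are sorted in increasing order; entry i-1 is i/N minus the
    cumulative share of the i smallest values.
    """
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    distances = []
    cumulative = 0.0
    for i, v in enumerate(ordered, start=1):
        cumulative += v
        distances.append(i / n - cumulative / total)
    return distances

def pietra_hoover(values):
    """Half the sum of absolute deviations of the shares from 1/N."""
    n = len(values)
    total = sum(values)
    return 0.5 * sum(abs(v / total - 1.0 / n) for v in values)

data = [1.0, 2.0, 3.0, 4.0]
print(lorenz_distances(data))
print(max(lorenz_distances(data)), pietra_hoover(data))  # the two agree
```

The rank at which the distance is largest separates the items below the mean from those above it, which is one way to read the "crucial" rank in such a display.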
4.7 Rosenbluth Coefficient
The Rosenbluth Coefficient is defined as
[TABLE]
where the symbol usually indicates a firm’s rank position on economic markets. Thereafter, can be taken as the rank of the percentage of a size measure, like some ratio.
Practically, the Rosenbluth index assigns more weight to weaker competitors. Such a measure which weights each competitor by its rank rather than by its “share” seems very appealing for our purpose.
The Rosenbluth coefficient is related to the Gini coefficient through
[TABLE]
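A minimal sketch of the coefficient and of its link to the Gini coefficient, assuming the standard form 1/(2 Σ i s_i - 1), with shares s_i ranked in decreasing order (rank 1 for the largest share), and the known relation RC = 1/(N (1 - Gi)). The Gini helper uses the mean absolute difference form, which is also an assumption; the function names are hypothetical.

```python
def rosenbluth(values):
    """Rosenbluth coefficient: 1 / (2 * sum(rank * share) - 1), largest share ranked 1."""
    total = sum(values)
    ranked = sorted((v / total for v in values), reverse=True)
    weighted = sum(rank * share for rank, share in enumerate(ranked, start=1))
    return 1.0 / (2.0 * weighted - 1.0)

def gini(values):
    """Gini coefficient from the mean absolute difference between all pairs."""
    n = len(values)
    mean = sum(values) / n
    pair_sum = sum(abs(a - b) for a in values for b in values)
    return pair_sum / (2.0 * n * n * mean)

data = [1.0, 2.0, 3.0, 4.0]
print(rosenbluth(data))                          # direct definition
print(1.0 / (len(data) * (1.0 - gini(data))))    # via the Gini coefficient
```

As a consistency check on the tabulated values, a Gini coefficient of 0.01120 with 23 teams gives 1/(23 × 0.98880), about 0.04397, which is the value listed in the last row of the summary table for the first race.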
5 Results and Analysis
Numerical results should be examined from two perspectives: (i) one takes into account new indices built on the constraint that a team’s evaluation and ranking depend on its members at the valuation time (here, at the end of the race); (ii) besides global statistical values, i.e., values irrespective of the team rank, one distinguishes values that take team ranks into account as a weight. The global and rank-weighted values are found in Table 7, in the top and bottom parts respectively. Most of the outputs arise from freely accessing .
One can remark that the orders of magnitude for and do not differ much, but these differ from . The entropy and the leadership gap temperature significantly differ from race to race. This can be traced to the similar order of magnitude of the , implying similarities in . This indicates that the overall distribution of rankings has little influence on global characteristics; in fact, one should expect that the hierarchies do not differ much from race to race. Concerning the rank effect as a weight for calculating indicators, one observes the largest effects in the “unweighted” indicators. This suggests providing displays based on ranks.
The first display of interest should be the variation of the new indicator as a function of the rank for the different races. Fig. 1 shows a plot of team final times ranked in increasing order of time, according to UCI rules. Note, once and for all, the meaning of the colors: they correspond to the Giro d’Italia (, green triangles), Tour de France (, blue circles), and Vuelta a España (, red triangles). For , the (best OLS) fit leads to some smooth exponential behavior. In contrast, a simple curve cannot fit the and data, in which steps appear, suggesting team clustering, as discussed below.
For the best ranked teams (), values for and are very similar. However, the values are very different from those in the other two races, pointing to different difficulties and/or to different types of skills of team members, and subsequently strategies, as hinted in the previous Sections.
Other new indicators involve the rank distribution. Fig. 2 presents the plot of , a term appearing in calculating , for the distribution, ranked in increasing order of time. A marked difference is found between data for and or . The evolution is nevertheless rather similar, presenting 3 inflexion points. The dashed line is a 4th-order polynomial, used as a guide to the eye only, for distinguishing the rank dependence of the indicator in the 3 races.
Similarly, one can study the contribution of each team to the Theil index through ,
Fig. 3 shows the plot of the values for the distribution, ranked in increasing order of time. The dashed line is a 4th-order polynomial, as a guide to the eye only, pointing to the crucial (core) rank () at the minimum of the s distribution. Indeed, a marked “difference” occurs for teams below , in the 3 races. A similar behavioral aspect is found for the variation of the s, - not shown to save space.
The most classical indicator of inequalities is the Gini coefficient; it is often presented as resulting from the ratio of surfaces, - somewhat difficult to estimate at first sight. In order to provide a better visualization, one displays the evolution of the distance to the equality distribution line as a function of the rank: Fig. 4 displays for the distribution of and times only for the 2023 finishing teams; in Fig. 5, one finds the distance for the distributions of and times for the 2023 and female races, respectively.
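To make the idea of a per-team contribution concrete, here is a minimal sketch assuming the standard Theil T index, T = (1/N) Σ_i (x_i/μ) ln(x_i/μ), with μ the mean of the x_i, so that team i contributes the i-th term of the sum. The definition actually used in this paper is given in an earlier section and may differ, e.g., in normalization or in the quantity x_i considered; the function name `theil_contributions` and the toy values are hypothetical.

```python
import math


def theil_contributions(values):
    """Per-item terms of the standard Theil T index (assumed definition).

    Each term is (x_i / mean) * ln(x_i / mean) / N; the terms sum to T.
    """
    n = len(values)
    mean = sum(values) / n
    return [(x / mean) * math.log(x / mean) / n for x in values]


# Toy team times (made up for illustration, not race data).
toy_times = [100.0, 101.0, 103.0, 110.0]
terms = theil_contributions(toy_times)
theil_index = sum(terms)
```

In this form, the terms are negative for values below the mean and positive above it; only their sum is guaranteed to be non-negative.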
Fig. 6 displays the distance between the Lorenz curve and the line of perfect equality, for the distribution of relative times. The maximum of each curve gives the crucial (core) rank, since =1 corresponds to the highest rank, = 23 for and = 22 for and . It is remarkable that the maximum of each curve occurs near the crucial (core) rank in all cases.
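The construction behind such displays can be sketched as follows, assuming the usual Lorenz curve built from values sorted in increasing order and taking the vertical distance to the equality line (the perpendicular distance differs only by a constant factor, so the location of the maximum is the same). The function names and the toy values are illustrative assumptions; the exact quantity plotted in the Figures (e.g., the definition of relative times) is that of the main text.

```python
def lorenz_distance(values):
    """Vertical distance between the equality line and the Lorenz curve.

    Returns one distance per rank k = 1..N, with values sorted in
    increasing order: d_k = k/N - (cumulative share of the k smallest).
    """
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    distances = []
    cumulative = 0.0
    for k, x in enumerate(ordered, start=1):
        cumulative += x
        distances.append(k / n - cumulative / total)
    return distances


def rank_of_max_distance(values):
    """Rank (1-based) at which the distance is largest."""
    d = lorenz_distance(values)
    return max(range(len(d)), key=d.__getitem__) + 1


# Toy values (made up for illustration, not race data).
toy = [1.0, 2.0, 3.0, 10.0]
core_rank = rank_of_max_distance(toy)
```

With this definition, the distance increases as long as the value at the current rank is below the mean and decreases afterwards, so the maximum sits at the last rank whose value is below the mean.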
One may propose some interpretation of the existence of a team core by analogy with an anharmonic oscillator. A few teams, the main ones, have definite goals and aims, with ad hoc team composition, but the other teams anticipate or respond to the strategy of the leading teams, which have well defined and expectedly well performing leaders. Besides the differences in rider skills, from which one anticipates different levels of performance, one can also imagine that the response of the not-quite-best teams introduces some non-linearity in the description of the overall race dynamics.
In all these Figures, in particular in Figs. 2 and 6, it might be noticed that the data points behave somewhat differently from those for and . Such different behaviors can, on the one hand, be traced back to quite different time distributions: observe the orders of magnitude of and , and a fortiori , , and , even taking into account the length difference of the races. On the other hand, this behavior likely reflects the difference in team approaches for the races, i.e., the team compositions and the team levels (see Table 2), leading to differences in time distributions.
Fig. 7 is a plot of vs. team final time, according to UCI rules. N.B. The - and - data have been rescaled (divided and multiplied, respectively) by a factor 1.25, in order to display the data on the same figure. Such a scaling factor () roughly corresponds to the ratio between the lengths of the relevant races; see Table 1. A very similar figure can be made for illustrating the and relationship, without bringing anything specifically more interesting; it is omitted to save space.
In the same line of thought, one could compare the differences in team ranks in and with respect to each other, besides discussing both vectors with respect to . This can be done through the Kendall coefficient or by measuring their relative Kemeny distance. These considerations are left for Appendix A, in order to refer to the “weighted preferences” notions, of somewhat wider interest than the above measures.
Observing the Tables and the Figures, one is rightly tempted to look for team clustering, within the perspective of this study.
(i) In Tables 3-5, one can observe clusters, admitting rank swapping:
- : the best 2 teams are swapped in and ; but the internal ranking is much scrambled between the 3rd and 8th team; the following 15 teams are equally ranked, except the last 2;
- : scrambling of the best 2nd and 5th teams; much scrambling thereafter, but with mere small swapping, in the center of the ranks, up to the last ranks;
- : the first 8 teams swap ranks regularly by pairs; with almost no scrambling thereafter.
(ii) In Fig. 1 and Fig. 7, one can observe clusters of teams for
- : a group of 16 below 3:04:30 and a group of 8 above; Fig. 7;
- : more precisely, below and above the rank for ;
- but see also a step at separating 2 clusters; Fig. 1;
- : below and others below and above ; Fig. 1;
- : a cluster of 16 teams centered on ; Fig. 7;
- : a cluster of 4 teams appears above 00:31 from values; Fig. 7;
- : 5 clusters seem to appear below , , and for , on Fig. 1;
- : a cluster of 14 teams, around [] [], Fig. 7.
These observations are reminiscent of self-organized complex systems, often amounting to 3 clusters (high, medium, low ranks/classes) under collaboration-competition rules, as found in many societies.
6 Discussion
Before discussing features, let it be recalled that one looks for (new) indicators containing (new) filters, in particular for stressing the contribution of team members to a team rank, - due to manager selections and strategies. Thus, one introduces the “leadership gap” , Eq. (4), and the “race temperature” , Eq. (10), - besides classical ranking indices. From the numerical values of interest in a set of study cases, one expects to deduce qualitative aspects of wider insights. Indeed, the classically used indicators (Sections 4.1 - 4.7) provide hypotheses (or assumptions) to managers devising strategies. The new indicators (Sections 3.1 - 3.2) increase perspectives.
From all Figures and Tables, one can notice that the (new) indicators reflect different contents: the former makes more precise the role of team members through their entire participation in the hierarchical procedure, while the latter emphasizes the strength of the competition leading to the final ranking.
An additional contribution stems from the weight given by the ranking to the indicators, when displaying them as a function of the team ranks (Figs. 1 - 6).
In fact, one observes the existence of clusters of teams, most explicitly in Fig. 7, as in many examples of socio-economic populations: a high, a medium and a low class of teams. Further investigation might be pursued through a recently proposed cluster stability indicator, the Unit Relevance Index (URI).
Let us recall that (alas) it is very difficult to move from one class to another, except through the introduction of external fields, - most likely money as the incentive. Inequalities and concentrations are inevitable, but one can observe how extreme cases are concerned. As examples, one has noticed that the two best teams may interchange their ranks according to the filtering, see Tables 3-5. The same holds for the worst teams. In both cases, very generally, the matter is very relevant for team management and race strategies. However, in order to maintain some form of competition, one must not neglect the “middle class” as the needed ballast. Nevertheless, the most relevant features to be considered are those containing extreme values since they are expected to lead the search toward the main differences in team ranking, - and specifically here disentangling characteristics.
Whence one enters into the consideration of collusion, or competition-collaboration, - which might change from stage to stage. Further studies should concern whether such collusions can be observed in each stage, in other words how the teams move in the range around the extremum in the displayed curves, particularly in the plots. Clearly, the best teams have not much interest in bringing upward “middle class” teams, nor do “middle class” teams in bringing upward the “low class” teams. Needless to say, strategies allowing a team to be better off by exerting a lower effort may be hidden when examining classical indicators, but could be highlighted through indicators based on differences between reasonably unbiased criteria.
Whence the study of “ranks resulting from strategies and skills” through the classical indices, but further taking into account the rank as weight, could add further value to the reliability of hierarchy findings, and promote attractive competition.
7 Conclusions
Therefore, before optimizing strategies, one should be convinced of the validity of efficiency criteria. In the present report, one focuses on team ranking when the valuation outcome depends much on the performance of a subset of team members. This is argued and demonstrated through data found in cycling races. It is suggested that new indicators be compared to classical ones. This leads one to observe features, like inequalities, clusters, “amount of competition”, “race temperature”, i.e., measures which provide quantified meanings outlining possible strategic goals arising in many team competitions. It is interesting to point out that one can observe management strategies through the comparison of indicator values in a given race or when comparing races. Thus, even within such case studies, quantitative measures suggest considerations for further empirical modeling.
In conclusion, the indicators based on (i) the concept of difference between the distributions of data points, and (ii) a probabilistic reasoning take into account the team final competition measure in an information-like approach, - the Shannon entropy. The proposed methodology is practical, simple, and useful: the study emphasizes that the method is based on scientific rationality and logical principles. A desirable characteristic of inequality measures is the existence of a graphical analogy with the indicator. This can enhance interpretability and help communicate results to non-experts. The study reported above presents such graphical characteristics allowing the valuation of team performance, - Figs. 3-6. The new notion of “crucial core rank”, emphasising the main teams, is well illustrated, and original.
In summary, the analyses of the proposed indicators point to a few practical features. For example, Table 1 and Table 6 allow one to envisage differences in race difficulty. The displayed values emphasize that the time distributions of and are similar for and , but differ from those of ; moreover, each skewness is positive, indicating wide distributions for the slowest teams. Each kurtosis value indicates whether balanced competition occurs in races. The same deduction holds when observing the , much shorter in , indicating a fiercer competition between the top teams. The values of , in Tables 3 - 5, when large, indicate that there is team emphasis on a distributive role according to riders’ skills. The as a relative temperature index appears to be a measure of the distribution of riders’ kinetic energy at the end of a race (or stage). An interesting output for race organizers and team managers arises from the ranking plots of indicators; those allow one to observe the crucial race core made of the most competitive teams.
In brief, the numerical values point to differences in team strategy and goals in a given race based on rider skill levels. Thus, it seems admissible that the indicators are new useful measures, but surely need to be further examined and developed.
Notice that the definition of , , , and can be used not only for measures after the final stage, but also for every stage; see Appendix B. The same holds true for most of the coefficients defined and calculated above, - but not in the first stage of the race, of course. Extensions, e.g., to the team member finishing rather than their can be easily done. (This might be valuable when discussing colorful jerseys in such races.) The selection of team members including the leader(s) can be based on previous statistics relying on the indicators. This enhances the answer to the question about designing and further optimizing strategies in a competitive environment.
Finally, it might be interesting to further discuss the Stage and Race Temperature indicators, e.g., through illustrative examples in order to appreciate the practical and theoretical meanings of the “intensity” measures. It can be suggested, as further research, that one constructs hypothetical races with only a very small number of teams and chooses sets of values to see the behaviour of stage and race temperature indicators. However, this would demand a different focus, - though surely essential to practical coaching.
This suggestion clearly demands longer sets of investigations in order to reach a meaningful discussion on intensity sizes; it obviously requires several simulations covering several cases.
Nevertheless, races with only a few competing teams are not common. World Rally Championship (WRC) competitions might be of interest, - but the team ranking rules are very different from those in cyclist races.
Thus, last but not least, even though the paper contains an original contribution to a special type of sporting activity, the lessons learned can be of more general use for many other activities, i.e., as long as the outcomes depend not only on the team effort but also on the performance of a subset of members of the team. Open questions remain on the collaboration-competition aspect of such races.
Appendix A:
Considerations on the Kendall coefficient and the weighted Kemeny distance.
How teams are ranked might have a substantial effect: in sport, the prize money is higher for the first teams than for the others. Sometimes the last teams face relegation and may lose sponsors. Thus, a swap in positions, due to different rules, may be crucial.
The difference in hierarchies, including the scattering of the results, derived from ranking rules, can be classically measured through the Kendall rank-rank correlation coefficient, or equivalently through the Kemeny distance in terms of the number () of competing teams, which reads , when there is no ex aequo. However, Can and Csató point out that the Kendall coefficient does not take into account the precise position of dissimilarities when comparing two linear ranking sets. In particular, there is no discrimination about the (teams) relative position in each list.
In order to weight the position of discordant pairs, Csató has proposed a hyperbolic function: , , based on the lowest rank of an item of a discordant pair. A smoother weight distribution, , has also been proposed. Obviously, the classical Kendall coefficient corresponds to choosing a for permutations forcing one of the vectors to become identical to the other. The procedure can be repeated, appropriately weighting the various swaps, whence obtaining a “weighted Kemeny distance” between pairs of ranks; they are called , , and respectively.
For completeness, one can observe the number of concordant and discordant pairs , obtain the “score” , thereafter the Kendall coefficient from , - when there is no ex aequo in the considered vectors. In the present cases, . The results are reported in Table 8.
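The counting procedure just described can be sketched as follows, assuming no ties (no ex aequo) and the standard definition of the Kendall coefficient as the score divided by the number of pairs, N(N-1)/2. Teams are assumed to be listed in a fixed (e.g., alphabetical) order, with one rank per team in each list. The function name `kendall_tau` and the toy ranks are illustrative assumptions; the weighted distances are not sketched here, since their weights are those specified in the cited works.

```python
from itertools import combinations


def kendall_tau(rank_a, rank_b):
    """Concordant/discordant pair counts, score, and Kendall coefficient.

    rank_a[i] and rank_b[i] are the ranks of the same team i under two
    ranking rules; ties are assumed absent.
    """
    n = len(rank_a)
    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        product = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    score = concordant - discordant
    tau = 2.0 * score / (n * (n - 1))
    return concordant, discordant, score, tau


# Toy ranks for 4 teams (made up): the two best teams are swapped.
ranks_rule_1 = [1, 2, 3, 4]
ranks_rule_2 = [2, 1, 3, 4]
n_c, n_d, score, tau = kendall_tau(ranks_rule_1, ranks_rule_2)
```

Without ties, the number of discordant pairs equals the unweighted Kemeny distance up to a convention-dependent factor.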
Notice that practically, in order to compare the ranks of pairs of teams, it is first useful to organize the teams in alphabetical order, giving them the appropriate rank for a given indicator. In all studied cases in the main text, the ranking chosen is that corresponding to , as in Tables 3 - 5.
Notice that the indicator distances are necessarily ordered: , and is closer to than .
Appendix B:
Considerations for extensions to daily stages.
One can sketch how to specify the main text considerations toward a daily ranking mechanism, within a multi-stage race.
Recall that is the finishing time of one of the 3 fastest riders () of team for stage . The team final time is
[TABLE]
For the first stage, , of course, and . After the second stage, one claims according to UCI rules that
[TABLE]
After the 3rd stage,
[TABLE]
etc.
Next, consider the adjusted team time after two stages and , i.e., . This team adjusted time is not equal to , but is equal to the cumulative time of the best 3 () riders of the team at the -th stage, i.e., . Etc.
This seems to better represent the evolution of the team time and helps avoid Cipollini-like effects.
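The difference between the two team times can be illustrated with a minimal sketch, under one reading of the definitions above: the UCI-like team time after a given stage sums, stage by stage, the times of the 3 fastest riders of the team in that stage, whereas the adjusted team time sums the cumulative times of the 3 riders of the team with the smallest cumulative times after that stage. The function names, the data layout, and the toy times (in seconds) are hypothetical; every rider is assumed to finish every stage.

```python
def uci_team_time(stage_times, upto):
    """Sum over stages 1..upto of the 3 fastest stage times in the team.

    stage_times maps each rider to the list of that rider's stage times.
    """
    total = 0
    for s in range(upto):
        fastest_three = sorted(times[s] for times in stage_times.values())[:3]
        total += sum(fastest_three)
    return total


def adjusted_team_time(stage_times, upto):
    """Sum of the 3 smallest cumulative rider times after stage `upto`."""
    cumulative = [sum(times[:upto]) for times in stage_times.values()]
    return sum(sorted(cumulative)[:3])


# Toy team of 4 riders over 2 stages (made-up times, in seconds).
toy_team = {
    "rider_A": [100, 130],
    "rider_B": [101, 110],
    "rider_C": [102, 111],
    "rider_D": [120, 109],
}
after_stage_1 = (uci_team_time(toy_team, 1), adjusted_team_time(toy_team, 1))
after_stage_2 = (uci_team_time(toy_team, 2), adjusted_team_time(toy_team, 2))
```

In this sketch, the two times coincide after the first stage and the adjusted time can never be smaller than the UCI-like one afterwards, since the same 3 riders are not necessarily the fastest in every stage.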
Acknowledgements : Thanks to reviewers and editor for their patience and comments.
Thanks to Prof. J. Miśkiewicz for much help on coding.
Data availability : data is freely available, see text.
Funding : Work was partially supported by the project “A better understanding of socio-economic systems using Quantitative Methods from Physics”, funded by the European Union—Next generation EU and the Romanian Government under the National Recovery and Resilience Plan for Romania, contract no.760034/23.05.2023, code PNRR-C9-I8-CF 255/29.11.2022, through the Romanian Ministry of Research, Innovation and Digitalization, within Component 9, “Investment I8”. Moreover, P.K. acknowledges the support of ‘Digital Finance - Reaching New Frontiers’ (Horizon Marie Sklodowska-Curie Actions Industrial Doctoral Network), Ref. Number 101119635.
Disclosure Statement on competing interest : Neither relevant financial nor non-financial competing interest has to be mentioned.
- [1] Kulakowski K. Understanding the analytic hierarchy process. Boca Raton, FL: CRC Press; 2020.
- [2] Fritz F, Moretti S, Staudacher J. Social ranking problems at the interplay between social choice theory and coalitional games. Mathematics 2023; 11(24): 4905.
- [3] Ausloos M, Rotundo G, Cerqueti R. A theory of best choice selection through objective arguments grounded in linear response theory concepts. Physics 2024; 6(2): 468-482.
- [4] Kossi Y. Tournois séquentiels et compétition pour la prime d’excellence scientifique. Rev Fr Econ 2017; 32(4): 57-94. [in French]. Available from: https://doi.org/10.3917/rfe.174.0057
- [5] Sanz-Menéndez L, Cruz-Castro L. University academics’ preferences for hiring and promotion systems. Eur J High Educ 2019; 9(2): 153-171.
- [6] Jose VRR, Nau RF, Winkler RL. Scoring rules, generalized entropy, and utility maximization. Oper Res 2008; 56(5): 1146-1157.
- [7] Csató L. Some impossibilities of ranking in generalized tournaments. Int Game Theory Rev 2019; 21(01): 1940002.
- [8] Chebotarev P Yu., Shamis E. Characterizations of scoring methods for preference aggregation. Ann Oper Res 1998; 80(9): 299–332.
