# Efficient talent identification in women’s football: A ranking-based approach for goal scoring analysis

**Authors:** Songyi Song, Hee-Su Kim

PMC · DOI: 10.1371/journal.pone.0342115 · PLOS One · 2026-02-24

## TL;DR

This paper introduces a ranking-based method to efficiently identify goal-scoring talents in women’s football using performance data.

## Contribution

A novel ranking-based approach is proposed to address class imbalance in goal-scoring analysis and improve scouting efficiency.

## Key findings

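- Under nested cross-validation, LightGBM captured 79.4% of goal-scoring observations within the top 20% of ranked observations; an out-of-bag bootstrap gains analysis gave 73.9% (lift = 3.69x; 95% CI: 63.9% to 84.3%).
- Permutation and SHAP consensus highlighted tactical availability (Total Offers) and combined technical/physical workload indicators (Passes Attempted, Jogging Distance, Top Speed).
- Ranking-based evaluation reflected scouting efficiency that classification metrics alone do not capture.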

## Abstract

Individual goal-scoring analysis in women’s football faces severe class imbalance and limited scouting resources, where classification metrics alone do not capture operational efficiency. We analyzed 2,535 non-goalkeeper player-match observations from the 2023 FIFA Women’s World Cup (736 unique players) with 51 performance features, excluding match-outcome variables to emphasize individual actions. Using nested cross-validation, LightGBM captured 79.4% of goal-scoring observations within the top 20% of ranked observations; an out-of-bag (OOB) bootstrap gains analysis yielded 73.9% capture at the top 20% (lift = 3.69x; 95% CI: 63.9% to 84.3%). Permutation and SHAP consensus highlighted tactical availability (Total Offers) and combined technical/physical workload indicators (Passes Attempted, Jogging Distance, Top Speed). This proof-of-concept study shows that ranking-based evaluation improves scouting efficiency using basic match statistics, while thresholds and feature weights require validation in other competitive contexts.
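
The capture and lift figures in the abstract can be illustrated with a short sketch. It assumes the standard gains-chart definitions: capture is the share of all goal-scoring observations that land in the top fraction of the ranking, and lift is that capture divided by the fraction reviewed. Under that reading, 73.9% capture at 20% gives roughly 3.7, in line with the reported 3.69x. The function names and toy data below are hypothetical and are not taken from the paper.

```python
import math


def capture_at_top(scores, labels, fraction=0.20):
    """Share of all positives found in the top `fraction` of observations,
    ranked by descending score. Labels are 1 (goal) or 0 (no goal)."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("at least one positive label is required")
    # Number of observations a scout would review at this cutoff.
    k = max(1, math.ceil(fraction * len(scores)))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    captured = sum(label for _, label in ranked[:k])
    return captured / total_positives


def lift_at_top(scores, labels, fraction=0.20):
    """Capture divided by the share of observations actually reviewed.
    A random ranking has an expected lift of 1."""
    k = max(1, math.ceil(fraction * len(scores)))
    reviewed_share = k / len(scores)
    return capture_at_top(scores, labels, fraction) / reviewed_share


# Toy data (hypothetical): 10 player-match observations, 2 with a goal.
scores = [0.9, 0.1, 0.8, 0.3, 0.2, 0.4, 0.05, 0.6, 0.15, 0.25]
labels = [1, 0, 0, 0, 0, 0, 0, 1, 0, 0]

capture = capture_at_top(scores, labels, 0.20)
lift = lift_at_top(scores, labels, 0.20)
print(f"capture@20% = {capture:.2f}, lift = {lift:.2f}")
```

In the toy data, the top two ranked observations contain one of the two goals, so reviewing 20% of the list recovers half of the goal-scoring observations. The appeal for scouting is that the cutoff maps directly onto review capacity, which a single classification threshold does not.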

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12931764/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12931764/full.md

## References

30 references — full list in the complete paper: https://tomesphere.com/paper/PMC12931764/full.md

---
Source: https://tomesphere.com/paper/PMC12931764