FOOTPASS: A Multi-Modal Multi-Agent Tactical Context Dataset for Play-by-Play Action Spotting in Soccer Broadcast Videos
Jeremie Ochin (CAOR), Raphael Chekroun, Bogdan Stanciulescu (CAOR), Sotiris Manitsaris (CAOR)

TL;DR
FOOTPASS is a comprehensive dataset and benchmark designed to improve automated play-by-play action spotting in soccer videos by integrating multi-modal, multi-agent data with tactical knowledge for enhanced sports analytics.
Contribution
The paper introduces the first multi-modal, multi-agent tactical dataset for soccer, combining computer vision and tactical priors to automate play-by-play annotation.
Findings
Enables development of player-centric action spotting methods.
Supports integration of vision outputs with tactical knowledge.
Facilitates more reliable and automated soccer event annotation.
Abstract
Soccer video understanding has motivated the creation of datasets for tasks such as temporal action localization, spatiotemporal action detection (STAD), or multi-object tracking (MOT). The annotation of structured sequences of events (who does what, when, and where) used for soccer analytics requires a holistic approach that integrates both STAD and MOT. However, current action recognition methods remain insufficient for constructing reliable play-by-play data and are typically used to assist rather than fully automate annotation. Parallel research has advanced tactical modeling, trajectory forecasting, and performance analysis, all grounded in game-state and play-by-play data. This motivates leveraging tactical knowledge as a prior to support computer-vision-based predictions, enabling more automated and reliable extraction of play-by-play data. We introduce Footovision Play-by-Play…
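To make the "who does what, when, and where" structure concrete, here is a minimal Python sketch of how one play-by-play event might be represented and ordered into a sequence. It is an illustration only and not the dataset's actual schema: the class name PlayByPlayEvent, its field names, the action labels, and the pitch-coordinate convention are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PlayByPlayEvent:
    """One event record. All field names are hypothetical, not the FOOTPASS schema."""
    frame: int                      # when: video frame index of the action
    team: str                       # who: team label (assumed values "home"/"away")
    player_id: int                  # who: player identity, e.g. taken from a tracker
    action: str                     # what: action class label (assumed names)
    position: Tuple[float, float]   # where: pitch coordinates in metres (assumed)


def to_sequence(events: List[PlayByPlayEvent]) -> List[PlayByPlayEvent]:
    """Order events chronologically to form a play-by-play sequence."""
    return sorted(events, key=lambda e: e.frame)


# Made-up example values, for illustration only.
events = [
    PlayByPlayEvent(frame=250, team="home", player_id=8, action="pass", position=(52.0, 30.5)),
    PlayByPlayEvent(frame=120, team="home", player_id=4, action="pass", position=(35.0, 20.0)),
]
sequence = to_sequence(events)
```

In this sketch the player identity would come from the tracking (MOT) side and the action label and frame from the detection (STAD) side, which is why the abstract argues that both have to be integrated to produce such records.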
Taxonomy
Topics: Human Pose and Action Recognition · Video Analysis and Summarization · Anomaly Detection Techniques and Applications
