Foul prediction with estimated poses from soccer broadcast video
Jiale Fang, Calvin Yeung, Keisuke Fujii

TL;DR
This paper presents a deep learning approach that combines video, bounding box, image, and pose data to predict fouls in soccer, advancing behavior prediction in sports analytics.
Contribution
It introduces a novel multi-modal deep learning model and a new dataset for soccer foul prediction, integrating pose estimation with other visual cues.
Findings
Full model outperforms ablated versions
All modalities contribute to prediction accuracy
Pose information is valuable for foul prediction
Abstract
Recent advances in computer vision have brought significant progress in tracking and pose estimation of sports players. However, there have been fewer studies on behavior prediction with pose estimation in sports. In particular, predicting soccer fouls is challenging because each player occupies only a small region of the image and because information such as the ball and player poses is difficult to use. In this research, we introduce a deep learning approach for anticipating soccer fouls. The method integrates video data, bounding box positions, image details, and pose information, using a newly curated soccer foul dataset. Our model combines convolutional and recurrent neural networks (CNNs and RNNs) to merge information from these four modalities. The experimental results show that our full model outperformed the ablated models, and all of the RNN modules,…
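The fusion idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture: it assumes each modality has already been reduced to a per-frame feature vector (standing in for the CNN stage), encodes each sequence with a plain tanh RNN, concatenates the final hidden states, and applies a sigmoid classifier. All names (`rnn_encode`, `predict_foul`), the feature dimensions, the sequence length, and the random untrained weights are hypothetical and chosen only for illustration.

```python
import numpy as np


def rnn_encode(seq, w_in, w_rec):
    """Run a plain tanh RNN over seq of shape (T, D); return the last hidden state (H,)."""
    h = np.zeros(w_rec.shape[0])
    for x_t in seq:
        h = np.tanh(w_in @ x_t + w_rec @ h)
    return h


def predict_foul(modalities, params, w_out, b_out):
    """Late fusion: encode each modality, concatenate the states, apply a sigmoid."""
    # Sort names so the concatenation order is deterministic.
    encoded = [rnn_encode(modalities[name], *params[name]) for name in sorted(modalities)]
    fused = np.concatenate(encoded)
    logit = float(w_out @ fused + b_out)
    return float(1.0 / (1.0 + np.exp(-logit)))


rng = np.random.default_rng(0)
T, H = 8, 16  # hypothetical: frames per clip, hidden units per modality

# Hypothetical per-frame feature sizes for the four modalities named in the abstract.
dims = {"video": 64, "bbox": 4, "image": 32, "pose": 34}

# Random, untrained weights: (input-to-hidden, hidden-to-hidden) per modality.
params = {
    name: (rng.normal(0, 0.1, (H, d)), rng.normal(0, 0.1, (H, H)))
    for name, d in dims.items()
}
w_out = rng.normal(0, 0.1, H * len(dims))
b_out = 0.0

# One synthetic clip: a (T, D) feature sequence per modality.
clip = {name: rng.normal(size=(T, d)) for name, d in dims.items()}
prob = predict_foul(clip, params, w_out, b_out)
print(f"foul probability (untrained): {prob:.3f}")
```

In this sketch, an ablation amounts to dropping one key from `dims` before building the weights, which shrinks the fused vector accordingly. A trained model would learn the weights from labeled clips rather than sample them at random.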
Taxonomy
Topics: Sports Analytics and Performance · Video Analysis and Summarization · Sports Dynamics and Biomechanics
