Team-Aware Football Player Tracking with SAM: An Appearance-Based Approach to Occlusion Recovery
Chamath Ranasinghe, Uthayasanker Thayasivam

TL;DR
This paper introduces a lightweight, team-aware football player tracking system that combines SAM, CSRT trackers, and jersey color models. It reports 7.6-7.7 FPS, 100% tracking success under light occlusion, and 90% in crowded scenes.
Contribution
It presents a novel appearance-based tracking approach integrating SAM with traditional trackers and jersey color cues for improved occlusion recovery in football videos.
Findings
Achieves 7.6-7.7 FPS with stable memory usage (~1880 MB).
Maintains 100% tracking success under light occlusion and 90% in crowded penalty-box scenes with 5 or more players.
Recovers 50% of heavy occlusions with appearance re-identification.
Abstract
Football player tracking is challenged by frequent occlusions, similar appearances, and rapid motion in crowded scenes. This paper presents a lightweight SAM-based tracking method combining the Segment Anything Model (SAM) with CSRT trackers and jersey color-based appearance models. We propose a team-aware tracking system that uses SAM for precise initialization and HSV histogram-based re-identification to improve occlusion recovery. Our evaluation measures three dimensions: processing speed (FPS and memory), tracking accuracy (success rate and box stability), and robustness (occlusion recovery and identity consistency). Experiments on football video sequences show that the approach achieves 7.6-7.7 FPS with stable memory usage (~1880 MB), maintaining 100 percent tracking success in light occlusions and 90 percent in crowded penalty-box scenarios with 5 or more players. Appearance-based…
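The HSV histogram-based re-identification mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, bin counts, thresholds, and the choice of the Bhattacharyya coefficient as the similarity measure are all assumptions made for illustration, and it uses only the Python standard library rather than an image-processing library. The idea is to summarize each team's jersey as a normalized hue-saturation histogram, then label a re-detected player by the closest team model.

```python
import colorsys
import math


def hue_sat_histogram(pixels, h_bins=12, s_bins=4, min_value=0.2):
    """Normalized hue-saturation histogram of (R, G, B) pixels in 0-255.

    Hypothetical helper: bin counts and the darkness cutoff are assumptions.
    """
    hist = [0.0] * (h_bins * s_bins)
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        if v < min_value:
            continue  # skip very dark pixels, whose hue is unreliable
        h_idx = min(int(h * h_bins), h_bins - 1)
        s_idx = min(int(s * s_bins), s_bins - 1)
        hist[h_idx * s_bins + s_idx] += 1.0
    total = sum(hist)
    return [x / total for x in hist] if total > 0 else hist


def bhattacharyya_coefficient(p, q):
    """Similarity in [0, 1] between two normalized histograms (1 = identical)."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def assign_team(candidate_hist, team_models, min_similarity=0.5):
    """Return (team, score) for the best-matching team model.

    Returns (None, score) when no model reaches min_similarity (an assumed
    threshold), so an ambiguous candidate is left unlabeled.
    """
    best_team, best_score = None, 0.0
    for team, model in team_models.items():
        score = bhattacharyya_coefficient(candidate_hist, model)
        if score > best_score:
            best_team, best_score = team, score
    if best_score < min_similarity:
        return None, best_score
    return best_team, best_score


# Usage with synthetic jersey pixels (made-up colors, not data from the paper).
red_kit = [(220, 30, 30)] * 50 + [(200, 40, 35)] * 50
blue_kit = [(30, 40, 210)] * 100
team_models = {
    "red": hue_sat_histogram(red_kit),
    "blue": hue_sat_histogram(blue_kit),
}
candidate = hue_sat_histogram([(210, 35, 32)] * 80)
team, score = assign_team(candidate, team_models)
print(team, round(score, 3))
```

In a real pipeline the pixel lists would presumably come from the region inside a segmentation mask rather than a full bounding box, so that pitch and crowd colors do not contaminate the jersey model. That masking step is not shown here.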
Taxonomy
Topics: Video Analysis and Summarization · Video Surveillance and Tracking Methods · Human Pose and Action Recognition
