Do We Need Large VLMs for Spotting Soccer Actions?
Ritabrata Chakraborty, Rajatsubhra Chakraborty, Avijit Dasgupta, Sandeep Chaurasia

TL;DR
This paper explores replacing dense video processing in soccer action spotting with a lightweight, text-based approach: Large Language Models read expert commentary and spot key actions, coming close to state-of-the-art video-based spotters without processing any video.
Contribution
It introduces a text-based method in which three LLMs act as judges, specializing in outcome, excitement, and tactics, to spot soccer actions from expert commentary, avoiding the compute cost of video-based models.
Findings
Language-based approach performs near state-of-the-art in action spotting
Zero video processing required, saving computational resources
Method is scalable and efficient for large-scale match analysis
Abstract
Traditional video-based tasks like soccer action spotting rely heavily on visual inputs, often requiring complex and computationally expensive models to process dense video data. We propose a shift from this video-centric approach to a text-based task, making it lightweight and scalable by utilizing Large Language Models (LLMs) instead of Vision-Language Models (VLMs). We posit that expert commentary, which provides rich descriptions and contextual cues, contains sufficient information to reliably spot key actions in a match. To demonstrate this, we employ a system of three LLMs acting as judges specializing in outcome, excitement, and tactics for spotting actions in soccer matches. Our experiments show that this language-centric approach performs effectively in detecting critical match events, coming close to state-of-the-art video-based spotters while using zero video processing compute…
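The three-judge setup described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the judges are keyword stubs standing in for real LLM calls, the label set (`goal`, `card`, `corner`) is hypothetical, and combining the judges by a simple vote threshold (`min_votes`) is one plausible aggregation rule that the abstract does not specify.

```python
from collections import Counter
from typing import Callable, List, Optional, Tuple

# A judge maps one commentary line to an action label, or None for "no action".
# In a real system each judge would be an LLM prompted for its specialty;
# here each one is a keyword stub (an assumption, for illustration only).
Judge = Callable[[str], Optional[str]]


def keyword_judge(rules: List[Tuple[str, str]]) -> Judge:
    """Build a stub judge that returns the label of the first matching keyword."""
    def judge(commentary: str) -> Optional[str]:
        text = commentary.lower()
        for keyword, label in rules:
            if keyword in text:
                return label
        return None
    return judge


# Hypothetical stand-ins for the outcome, excitement, and tactics judges.
outcome_judge = keyword_judge([("scores", "goal"), ("yellow card", "card"), ("corner", "corner")])
excitement_judge = keyword_judge([("what a finish", "goal"), ("booked", "card")])
tactics_judge = keyword_judge([("into the net", "goal"), ("set piece", "corner")])

JUDGES: List[Judge] = [outcome_judge, excitement_judge, tactics_judge]


def spot_actions(
    commentary_lines: List[Tuple[float, str]],
    judges: List[Judge] = JUDGES,
    min_votes: int = 2,
) -> List[Tuple[float, str]]:
    """Return (timestamp, label) pairs on which at least min_votes judges agree."""
    spotted = []
    for timestamp, text in commentary_lines:
        labels = [judge(text) for judge in judges]
        votes = Counter(label for label in labels if label is not None)
        if not votes:
            continue  # no judge saw an action in this line
        label, count = votes.most_common(1)[0]
        if count >= min_votes:
            spotted.append((timestamp, label))
    return spotted


if __name__ == "__main__":
    commentary = [
        (12.0, "He scores! What a finish, straight into the net"),
        (30.5, "The referee shows a yellow card, he is booked"),
        (45.0, "Possession recycled in midfield"),
    ]
    print(spot_actions(commentary))
```

The only input is timestamped commentary text, which is the point of the approach: no frames are decoded or embedded at any step. Swapping each stub for an LLM call would leave `spot_actions` unchanged.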
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI)
