COACH: Collaborative Agents for Contextual Highlighting -- A Multi-Agent Framework for Sports Video Analysis
Tsz-To Wong, Ching-Chun Huang, Hong-Han Shuai

TL;DR
This paper introduces COACH, a flexible multi-agent framework for sports video analysis that improves understanding of temporal context, adapts across tasks, and offers interpretability.
Contribution
We propose a reconfigurable multi-agent system in which each agent specializes in a different aspect of analysis, enabling adaptive, scalable, and interpretable sports video understanding across multiple tasks.
Findings
Demonstrated adaptability in badminton analysis tasks.
Bridged fine-grained event detection and global semantic organization.
Showcased improved generalization and interpretability.
Abstract
Intelligent sports video analysis demands a comprehensive understanding of temporal context, from micro-level actions to macro-level game strategies. Existing end-to-end models often struggle with this temporal hierarchy, offering solutions that lack generalization, incur high development costs for new tasks, and suffer from poor interpretability. To overcome these limitations, we propose a reconfigurable Multi-Agent System (MAS) as a foundational framework for sports video understanding. In our system, each agent functions as a distinct "cognitive tool" specializing in a specific aspect of analysis. The system's architecture is not confined to a single temporal dimension or task. By leveraging iterative invocation and flexible composition of these agents, our framework can construct adaptive pipelines for both short-term analytic reasoning (e.g., Rally QA) and long-term generative…
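The idea of composing specialized agents into task-specific pipelines can be illustrated with a minimal sketch. This is not the authors' implementation: the agent names (`detect_shots`, `summarize_rally`), the `compose` helper, and the dictionary-based context are all hypothetical placeholders chosen for illustration. The agents here do trivial work in place of real video models.

```python
from typing import Any, Callable, Dict, List

# Assumption: a shared context is a plain dict, and an agent maps context -> context.
Context = Dict[str, Any]
Agent = Callable[[Context], Context]


def detect_shots(ctx: Context) -> Context:
    # Hypothetical micro-level agent: treats each frame label as one shot event.
    return {**ctx, "shots": list(ctx["frames"])}


def summarize_rally(ctx: Context) -> Context:
    # Hypothetical macro-level agent: summarizes the shots found so far.
    return {**ctx, "summary": f"{len(ctx['shots'])} shots"}


def compose(agents: List[Agent]) -> Agent:
    # Chain agents so that each one sees the context produced by the previous one.
    def pipeline(ctx: Context) -> Context:
        for agent in agents:
            ctx = agent(ctx)
        return ctx
    return pipeline


# The same agents can be recombined into different pipelines per task.
rally_qa = compose([detect_shots, summarize_rally])
result = rally_qa({"frames": ["serve", "clear", "smash"]})
print(result["summary"])
```

Because every agent shares the same context-in, context-out shape, a pipeline is itself an agent and can be invoked repeatedly (for example, once per rally) or nested inside a longer pipeline.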
Taxonomy
Topics: Video Analysis and Summarization · Artificial Intelligence in Games · Human Motion and Animation
