MatchTime: Towards Automatic Soccer Game Commentary Generation
Jiayuan Rao, Haoning Wu, Chang Liu, Yanfeng Wang, Weidi Xie

TL;DR
This paper introduces MatchTime, a high-quality dataset for automatic soccer commentary generation, and MatchVoice, a commentary model trained on it. Together they address video-text misalignment in existing data and achieve state-of-the-art results on the task.
Contribution
It presents a manually aligned benchmark (SN-Caption-test-align), a multi-modal temporal alignment pipeline, a curated training dataset (MatchTime), and a commentary generation model (MatchVoice), improving data quality and performance in soccer commentary automation.
Findings
The alignment pipeline significantly improves dataset quality.
Training on the curated dataset improves model performance.
The MatchVoice model achieves state-of-the-art results in commentary generation.
Abstract
Soccer is a globally popular sport with a vast audience. In this paper, we consider constructing an automatic soccer game commentary model to improve the audiences' viewing experience. In general, we make the following contributions: first, observing the prevalent video-text misalignment in existing datasets, we manually annotate timestamps for 49 matches, establishing a more robust benchmark for soccer game commentary generation, termed SN-Caption-test-align; second, we propose a multi-modal temporal alignment pipeline to automatically correct and filter the existing dataset at scale, creating a higher-quality soccer game commentary dataset for training, denoted as MatchTime; third, based on our curated dataset, we train an automatic commentary generation model, named MatchVoice. Extensive experiments and ablation studies have demonstrated the effectiveness of our alignment…
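The abstract describes the alignment step only at a high level: timestamps are corrected, and poorly matched samples are filtered out. The Python sketch below illustrates that general idea and is not the authors' pipeline. The `Commentary` record, the `align_and_filter` function, the search window, the step size, the score threshold, and the `toy_similarity` scorer are all hypothetical stand-ins chosen for illustration; a real system would use a learned video-text similarity model in place of the toy scorer.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Commentary:
    text: str
    timestamp: float  # seconds; possibly misaligned in the source data


def align_and_filter(
    commentaries: List[Commentary],
    similarity: Callable[[str, float], float],  # hypothetical text-vs-video-time scorer
    window: float = 45.0,    # assumed search range around the original timestamp
    step: float = 1.0,       # assumed search granularity in seconds
    min_score: float = 0.5,  # assumed threshold below which a sample is dropped
) -> List[Commentary]:
    """Move each commentary to its best-scoring time; drop weak matches."""
    aligned = []
    for c in commentaries:
        best_t, best_s = c.timestamp, float("-inf")
        t = max(0.0, c.timestamp - window)
        while t <= c.timestamp + window:
            s = similarity(c.text, t)
            if s > best_s:
                best_t, best_s = t, s
            t += step
        if best_s >= min_score:  # filtering step
            aligned.append(Commentary(c.text, best_t))
    return aligned


def toy_similarity(text: str, t: float) -> float:
    """Toy scorer: peaks at a known 'true' time, returns 0 for unknown text."""
    true_times = {"Goal from the penalty spot": 120.0}
    if text not in true_times:
        return 0.0
    return 1.0 / (1.0 + abs(t - true_times[text]))


raw = [
    Commentary("Goal from the penalty spot", 100.0),  # 20 s early
    Commentary("Unrelated studio chatter", 300.0),    # no visual match
]
result = align_and_filter(raw, toy_similarity)
print(result)
```

In this toy run the first commentary is moved to the time where the scorer peaks, and the second is discarded because no candidate time reaches the threshold.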
Taxonomy
Topics: Video Analysis and Summarization · Sports Analytics and Performance · Artificial Intelligence in Games
