SCBench: A Sports Commentary Benchmark for Video LLMs
Kuangzhi Ge, Lingjun Chen, Kevin Zhang, Yulin Luo, Tianyu Shi, Liaoyuan Fan, Xiang Li, Guanqun Wang, Shanghang Zhang

TL;DR
SCBench introduces a sports video commentary benchmark with a new dataset and metric, enabling detailed evaluation of Video LLMs' temporal and visual understanding in complex, real-world sports scenarios.
Contribution
The paper presents a novel sports commentary generation task, a specialized dataset, and a new six-dimensional evaluation metric for Video LLMs, addressing limitations of existing benchmarks.
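As a rough illustration of how a six-dimensional metric could be collapsed into a single score for ranking models, here is a minimal Python sketch. It rests on assumptions not confirmed by this summary: the dimension names are placeholders (the six dimensions are not listed here), and the equal-weight average is an assumed aggregation rule, not necessarily the one the paper uses.

```python
from statistics import mean

# Placeholder names: this summary does not list the six dimensions.
DIMENSIONS = ("dim_1", "dim_2", "dim_3", "dim_4", "dim_5", "dim_6")


def overall_score(scores: dict[str, float]) -> float:
    """Average the per-dimension scores (equal weighting is an assumption)."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing dimension scores: {missing}")
    return mean(scores[d] for d in DIMENSIONS)


def rank_models(per_model: dict[str, dict[str, float]]) -> list[tuple[str, float]]:
    """Return (model, overall score) pairs sorted from best to worst."""
    ranked = ((name, overall_score(s)) for name, s in per_model.items())
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)
```

Calling rank_models on a mapping from hypothetical model names to their six per-dimension scores yields a leaderboard-style list; a different weighting would only change overall_score.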
Findings
InternVL-Chat-2 achieves top performance with a score of 5.44.
The proposed metric and dataset enable a more nuanced assessment of Video LLMs than existing QA-only benchmarks.
The evaluation shows that current Video LLMs still have significant room for improvement on sports commentary.
Abstract
Recently, significant advances have been made in Video Large Language Models (Video LLMs) in both academia and industry. However, methods to evaluate and benchmark the performance of different Video LLMs, especially their fine-grained, temporal visual capabilities, remain very limited. On one hand, current benchmarks use relatively simple videos (e.g., subtitled movie clips) where the model can understand the entire video by processing just a few frames. On the other hand, their datasets lack diversity in task format, comprising only QA or multi-choice QA, which overlooks the models' capacity for generating in-depth and precise texts. Sports videos, which feature intricate visual information, sequential events, and emotionally charged commentary, present a critical challenge for Video LLMs, making sports commentary an ideal benchmarking task. Inspired by these challenges, we propose a…
Taxonomy
Topics: Video Analysis and Summarization · Digital Rights Management and Security · Sports Analytics and Performance
