TL;DR
GameSight is a two-stage knowledge-enhanced visual reasoning model that generates accurate, context-aware soccer commentary, improving entity alignment and commentary quality over previous methods.
Contribution
The paper introduces GameSight, a novel two-stage approach that combines visual reasoning and external knowledge to produce more accurate and engaging soccer commentary.
Findings
Improves player alignment accuracy by 18.5%
Outperforms previous models in segment-level accuracy and commentary quality
Enhances game context relevance and structural composition
Abstract
Soccer commentary plays a crucial role in enhancing the viewing experience for audiences. Previous studies in automatic soccer commentary generation typically adopt an end-to-end method to generate anonymous live text commentary. Such commentary falls short of real-world live televised commentary: it contains anonymous entities and context-dependent errors, and it lacks statistical insights into game events. To bridge this gap, we propose GameSight, a two-stage model that treats soccer commentary generation as a knowledge-enhanced visual reasoning task, enabling knowledgeable, live-televised-style commentary with accurate references to entities (players and teams). GameSight first performs visual reasoning to align anonymous entities using fine-grained visual and contextual analysis. Subsequently, the entity-aligned commentary is refined with…
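The two-stage structure described in the abstract can be pictured with a toy sketch. This is not the authors' implementation: the abstract is truncated before it describes the second stage, so the refinement step below is an assumption based only on the "external knowledge" mentioned under Contribution. Every name here (`Detection`, `align_entities`, `enrich_with_knowledge`, `generate_commentary`, the `[PLAYER]` and `[TEAM]` placeholders, and the sample data) is hypothetical, and plain string replacement stands in for the actual visual reasoning.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical output of stage 1: an anonymous token and its resolved name."""
    placeholder: str  # anonymous token in the draft, e.g. "[PLAYER]"
    name: str         # entity name assumed to come from visual reasoning


def align_entities(draft: str, detections: list[Detection]) -> str:
    """Stage 1 (toy stand-in): replace each anonymous placeholder with a name."""
    for det in detections:
        # Replace only the first occurrence so repeated placeholders
        # can be resolved to different entities in order.
        draft = draft.replace(det.placeholder, det.name, 1)
    return draft


def enrich_with_knowledge(commentary: str, knowledge: dict[str, str]) -> str:
    """Stage 2 (assumed): append a stored fact for every entity mentioned."""
    facts = [fact for name, fact in knowledge.items() if name in commentary]
    if not facts:
        return commentary
    return commentary + " " + " ".join(facts)


def generate_commentary(
    draft: str, detections: list[Detection], knowledge: dict[str, str]
) -> str:
    """Run the two stages in order: align entities, then add knowledge."""
    return enrich_with_knowledge(align_entities(draft, detections), knowledge)


# Made-up example data, for illustration only.
draft = "[PLAYER] shoots for [TEAM]."
detections = [Detection("[PLAYER]", "Player A"), Detection("[TEAM]", "Team X")]
knowledge = {"Player A": "Player A has 3 shots on target today."}
print(generate_commentary(draft, detections, knowledge))
```

Keeping the two stages as separate functions mirrors the split the abstract describes, so entity alignment can be checked on its own before any knowledge is added.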
