Automatic Curation of Golf Highlights using Multimodal Excitement Features
Michele Merler, Dhiraj Joshi, Quoc-Bao Nguyen, Stephen Hammer, John Kent, John R. Smith, and Rogerio S. Feris

TL;DR
This paper presents a multimodal system that automatically identifies exciting moments in golf videos by analyzing player reactions, crowd cheering, and the commentator's tone and word choice, reducing manual editing effort and enabling personalized highlight reels.
Contribution
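The paper introduces a novel multimodal approach to sports highlight curation that fuses cues from player reactions, spectator cheering, and commentator speech, and that reduces the manual annotation needed to train its classifiers. The approach is demonstrated in a real-world golf tournament setting.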
Findings
Successfully extracted highlights from live golf streams over four days
Achieved accurate start and end frame detection of key moments
Reduced manual annotation through correlation of modalities
Abstract
The production of sports highlight packages summarizing a game's most exciting moments is an essential task for broadcast media. Yet, it requires labor-intensive video editing. We propose a novel approach for auto-curating sports highlights, and use it to create a real-world system for the editorial aid of golf highlight reels. Our method fuses information from the players' reactions (action recognition such as high-fives and fist pumps), spectators (crowd cheering), and commentator (tone of the voice and word analysis) to determine the most interesting moments of a game. We accurately identify the start and end frames of key shot highlights with additional metadata, such as the player's name and the hole number, allowing personalized content summarization and retrieval. In addition, we introduce new techniques for learning our classifiers with reduced manual training data annotation by…
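The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the modalities are combined, so the weighted average below is an assumption, not the authors' method. The names `SegmentScores`, `fuse`, and `rank_segments`, the equal weights, and the idea of per-segment scores in [0, 1] are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SegmentScores:
    """Hypothetical per-segment excitement scores in [0, 1], one per modality."""
    player_reaction: float    # e.g. output of a high-five / fist-pump detector
    crowd_cheer: float        # e.g. output of a crowd-cheering audio detector
    commentator_tone: float   # e.g. excitement in the commentator's voice
    commentator_words: float  # e.g. match against excited words in the transcript


# Illustrative equal weights; these are assumptions, not values from the paper.
WEIGHTS: Dict[str, float] = {
    "player_reaction": 0.25,
    "crowd_cheer": 0.25,
    "commentator_tone": 0.25,
    "commentator_words": 0.25,
}


def fuse(scores: SegmentScores, weights: Dict[str, float] = WEIGHTS) -> float:
    """Combine modality scores into one excitement score via a weighted average."""
    total = sum(weights.values())
    weighted = sum(getattr(scores, name) * w for name, w in weights.items())
    return weighted / total


def rank_segments(
    segments: Dict[str, SegmentScores], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Return the top_k (segment_id, fused_score) pairs, most exciting first."""
    fused = [(seg_id, fuse(s)) for seg_id, s in segments.items()]
    return sorted(fused, key=lambda pair: pair[1], reverse=True)[:top_k]
```

For example, calling `rank_segments` on a dictionary that maps segment identifiers to `SegmentScores` returns the candidate highlights ordered by fused score. A weighted average keeps each modality's contribution easy to inspect, though a learned combination could equally be used in its place.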
