Exploring Domain Incremental Video Highlights Detection with the LiveFood Benchmark
Sen Pei, Shixiong Xu, and Xiaojie Jin

TL;DR
This paper introduces Global Prototype Encoding (GPE), an incremental learning method for video highlights detection that adapts to new highlight domains and so addresses the poor scalability of existing methods. It also contributes a new dataset, LiveFood, to support this setting.
Contribution
The paper presents the first exploration of video highlights detection in an incremental learning setting and proposes GPE, a method that adapts to new highlight domains effectively.
Findings
GPE outperforms existing domain incremental learning methods on LiveFood.
GPE achieves significant improvements in mean Average Precision (mAP) across all domains.
GPE maintains competitive performance on classic datasets.
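The findings above are reported in mean Average Precision (mAP) per domain. As a rough illustration of what that metric measures, the following is a minimal sketch, assuming each video provides per-clip highlight scores and binary highlight labels, and that mAP for a domain is the mean of per-video AP. The function names, the data layout, and the toy numbers are hypothetical; the paper's exact evaluation protocol is not stated in this summary.

```python
from typing import Dict, List, Sequence, Tuple


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AP for one video: mean precision at the rank of each highlight clip."""
    # Rank clips from highest to lowest predicted highlight score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    # A video with no highlight clips contributes 0.0 (an assumption of this sketch).
    return sum(precisions) / len(precisions) if precisions else 0.0


def map_per_domain(
    videos_by_domain: Dict[str, List[Tuple[Sequence[float], Sequence[int]]]]
) -> Dict[str, float]:
    """Mean of per-video AP, computed separately for each domain."""
    result = {}
    for domain, videos in videos_by_domain.items():
        aps = [average_precision(scores, labels) for scores, labels in videos]
        result[domain] = sum(aps) / len(aps) if aps else 0.0
    return result


# Toy data only: domain names follow the LiveFood description, values are made up.
toy = {
    "cooking": [([0.9, 0.8, 0.1], [1, 0, 1]), ([0.7, 0.2], [1, 0])],
    "eating": [([0.3, 0.6], [0, 1])],
}
print(map_per_domain(toy))
```

In a domain-incremental setting, a table of such per-domain values after each training stage would show whether performance on earlier domains degrades as new domains are added.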
Abstract
Video highlights detection (VHD) is an active research field in computer vision, aiming to locate the most user-appealing clips given raw video inputs. However, most VHD methods are based on the closed world assumption, i.e., a fixed number of highlight categories is defined in advance and all training data are available beforehand. Consequently, existing methods have poor scalability with respect to increasing highlight domains and training data. To address the above issues, we propose a novel video highlights detection method named Global Prototype Encoding (GPE) that learns incrementally to adapt to new domains via parameterized prototypes. To facilitate this new research direction, we collect a finely annotated dataset termed LiveFood, including over 5,100 live gourmet videos that consist of four domains: ingredients, cooking, presentation, and eating. To the best of our knowledge,…
Taxonomy
Topics: Video Analysis and Summarization · Advanced Image and Video Retrieval Techniques · Image Enhancement Techniques
