TL;DR
This paper introduces ILS-SUMM, an unsupervised video summarization method based on Iterated Local Search. It selects a subset of video shots that minimizes the total distance between each shot and its nearest selected shot, subject to a constraint on the total summary duration.
Contribution
The paper proposes an ILS-based algorithm for shot-based video summarization and reports that it finds solutions with lower total distance than previous greedy approaches while scaling to longer videos.
Findings
ILS-SUMM achieves lower total distance than the previous greedy methods it is compared against.
The method scales to videos of varying lengths, including longer ones.
Experiments on both a new dataset and existing datasets support these results.
Abstract
In recent years, there has been an increasing interest in building video summarization tools, where the goal is to automatically create a short summary of an input video that properly represents the original content. We consider shot-based video summarization, where the summary consists of a subset of the video shots, which can be of various lengths. A straightforward approach to maximizing the representativeness of a subset of shots is to minimize the total distance between shots and their nearest selected shots. We formulate the task of video summarization as an optimization problem with a knapsack-like constraint on the total summary duration. Previous studies have proposed greedy algorithms to solve this problem approximately, but no experiments were presented to measure the ability of these methods to obtain solutions with low total distance. Indeed, our experiments on video…
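To make the formulation concrete, here is a minimal Python sketch of the objective (total distance from every shot to its nearest selected shot), the knapsack-like duration constraint, and a generic Iterated Local Search loop around them. It is an illustration under stated assumptions, not the paper's implementation: the function names, the add/swap neighbourhood, the random-swap perturbation, the starting solution, and the acceptance rule are all hypothetical choices made for this example.

```python
import random


def total_distance(dist, selected):
    """Sum, over all shots, of the distance to the nearest selected shot."""
    return sum(min(dist[i][j] for j in selected) for i in range(len(dist)))


def duration(durations, selected):
    """Total duration of the selected shots."""
    return sum(durations[j] for j in selected)


def local_search(dist, durations, budget, selected):
    """First-improvement local search over add and swap moves (assumed moves)."""
    n = len(dist)
    selected = set(selected)
    improved = True
    while improved:
        improved = False
        best = total_distance(dist, selected)
        outside = [j for j in range(n) if j not in selected]
        # Neighbours: add one unselected shot, or swap a selected shot
        # for an unselected one.
        candidates = [selected | {a} for a in outside]
        candidates += [(selected - {r}) | {a} for r in selected for a in outside]
        for cand in candidates:
            feasible = duration(durations, cand) <= budget
            if feasible and total_distance(dist, cand) < best:
                selected = cand
                improved = True
                break
    return selected


def ils_summarize(dist, durations, budget, iterations=50, seed=0):
    """Generic ILS: local search, perturb, local search, keep if better."""
    rng = random.Random(seed)
    n = len(dist)
    # Assumed starting point: the single shortest shot.
    start = min(range(n), key=lambda j: durations[j])
    if durations[start] > budget:
        raise ValueError("no single shot fits within the duration budget")
    best = local_search(dist, durations, budget, {start})
    best_cost = total_distance(dist, best)
    for _ in range(iterations):
        outside = [j for j in range(n) if j not in best]
        if not outside:
            break
        # Assumed perturbation: swap one random selected shot for a
        # random unselected shot, skipping infeasible results.
        cand = (best - {rng.choice(sorted(best))}) | {rng.choice(outside)}
        if duration(durations, cand) > budget:
            continue
        cand = local_search(dist, durations, budget, cand)
        cost = total_distance(dist, cand)
        if cost < best_cost:
            best, best_cost = cand, cost
    return sorted(best), best_cost
```

Here `dist` is a square matrix of pairwise shot distances, `durations` lists each shot's length, and `budget` is the maximum allowed summary duration. The function returns the selected shot indices together with their total distance.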
