Learning to Summarize Videos by Contrasting Clips

Ivan Sosnovik; Artem Moskalev; Cees Kaandorp; Arnold Smeulders

arXiv:2301.05213·cs.CV·April 20, 2023·1 cites

TL;DR

This paper introduces an unsupervised video summarization method using contrastive learning, which effectively creates diverse summaries without relying on labeled data by contrasting top-k features.

Contribution

It proposes a novel contrastive learning framework that contrasts top-k features for unsupervised video summarization, improving diversity and informativeness of summaries.

Findings

01. Achieves meaningful summaries without labeled data
02. Outperforms existing methods on benchmark datasets
03. Enhances diversity of video summaries

Abstract

Video summarization aims at choosing parts of a video that narrate a story as close as possible to the original one. Most of the existing video summarization approaches focus on hand-crafted labels. As the number of videos grows exponentially, there emerges an increasing need for methods that can learn meaningful summarizations without labeled annotations. In this paper, we aim to maximally exploit unsupervised video summarization while concentrating the supervision to a few, personalized labels as an add-on. To do so, we formulate the key requirements for informative video summarization. Then, we propose contrastive learning as the answer to both questions. To further boost Contrastive video Summarization (CSUM), we propose to contrast top-k features instead of a mean video feature as employed by the existing method, which we implement with a differentiable top-k feature selector.…
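
The idea of contrasting top-k features rather than a single mean video feature can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the selector or the loss, so the code below assumes a generic successive-softmax relaxation of top-k and an InfoNCE-style loss. All function names, the toy features, the frame scores, and the temperatures are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_topk_select(frames, scores, k, temperature=0.05):
    """Return k soft-selected features via k successive softmax rounds.

    Assumption: this relaxation is a stand-in for a differentiable
    top-k selector, not necessarily the one used in the paper.
    """
    alpha = list(scores)
    dim = len(frames[0])
    selected = []
    for _ in range(k):
        probs = softmax([a / temperature for a in alpha])
        # Convex combination of frame features under this round's weights.
        selected.append(
            [sum(p * f[d] for p, f in zip(probs, frames)) for d in range(dim)]
        )
        # Push down frames that were already picked before the next round.
        alpha = [a + math.log(max(1.0 - p, 1e-12)) for a, p in zip(alpha, probs)]
    return selected


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-12)


def info_nce(anchor, positives, negatives, tau=0.1):
    """InfoNCE-style loss, averaged over the positives."""
    neg_sum = sum(math.exp(cosine(anchor, n) / tau) for n in negatives)
    losses = []
    for p in positives:
        pos = math.exp(cosine(anchor, p) / tau)
        losses.append(-math.log(pos / (pos + neg_sum)))
    return sum(losses) / len(losses)


# Toy data (hypothetical): four 2-D frame features per video, with frame scores.
video_a = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
video_b = [[0.0, 1.0], [0.1, 0.9], [1.0, 0.0], [0.9, 0.1]]
frame_scores = [2.0, 1.5, -1.0, -2.0]

top_a = soft_topk_select(video_a, frame_scores, k=2)
top_b = soft_topk_select(video_b, frame_scores, k=2)

# A summary feature for video A should sit closer to A's top-k features
# than to B's, so the matched pairing should give the lower loss.
summary_a = [1.0, 0.05]
loss_matched = info_nce(summary_a, top_a, top_b)
loss_swapped = info_nce(summary_a, top_b, top_a)
```

In this sketch the k selected features stay separate instead of being averaged, so the loss can pull the summary toward several distinct high-scoring frames. At a low temperature each round is close to a hard pick of the next-best frame, while the weights remain smooth functions of the scores.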

Taxonomy

Topics: Video Analysis and Summarization · Music and Audio Processing · Human Motion and Animation

Methods: Contrastive Learning