Deep Reinforcement Learning for Unsupervised Video Summarization with Diversity-Representativeness Reward

Kaiyang Zhou; Yu Qiao; Tao Xiang

arXiv:1801.00054 · cs.CV · February 15, 2018 · 78 cites

TL;DR

This paper introduces an unsupervised deep reinforcement learning approach to video summarization that optimizes diversity and representativeness without labels, outperforming state-of-the-art unsupervised methods and performing comparably to or better than supervised approaches.

Contribution

The paper presents a novel unsupervised reinforcement learning framework with a custom reward function for diversity and representativeness in video summarization.

Findings

01. Outperforms state-of-the-art unsupervised methods

02. Comparable or superior to supervised approaches

03. Effective in producing diverse and representative summaries

Abstract

Video summarization aims to facilitate large-scale video browsing by producing short, concise summaries that are diverse and representative of original videos. In this paper, we formulate video summarization as a sequential decision-making process and develop a deep summarization network (DSN) to summarize videos. DSN predicts for each video frame a probability, which indicates how likely a frame is selected, and then takes actions based on the probability distributions to select frames, forming video summaries. To train our DSN, we propose an end-to-end, reinforcement learning-based framework, where we design a novel reward function that jointly accounts for diversity and representativeness of generated summaries and does not rely on labels or user interactions at all. During training, the reward function judges how diverse and representative the generated summaries are, while DSN…
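The selection step and the reward described in the abstract can be illustrated with a small sketch. The abstract does not give the reward formulas, so the specific definitions below are assumptions chosen for illustration: diversity is taken as the mean pairwise cosine dissimilarity among selected frames, representativeness as the exponential of the negative mean squared distance from each frame to its nearest selected frame, and the two are simply summed. All function names are hypothetical, the per-frame probabilities are hard-coded rather than produced by a network, and the policy-gradient update that would train the DSN is omitted.

```python
import math
import random


def sample_actions(probs, rng):
    """Draw one Bernoulli action per frame: 1 keeps the frame, 0 skips it."""
    return [1 if rng.random() < p else 0 for p in probs]


def cosine_dissimilarity(u, v):
    """1 - cosine similarity; assumes both feature vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def diversity_reward(features, selected):
    """Assumed form: mean pairwise dissimilarity among the selected frames."""
    n = len(selected)
    if n < 2:
        return 0.0  # diversity is undefined for fewer than two frames
    total = sum(
        cosine_dissimilarity(features[i], features[j])
        for i in selected
        for j in selected
        if i != j
    )
    return total / (n * (n - 1))


def representativeness_reward(features, selected):
    """Assumed form: exp(-mean squared distance to the nearest selected frame)."""
    if not selected:
        return 0.0  # an empty summary represents nothing
    mean_min = sum(
        min(squared_distance(x, features[j]) for j in selected)
        for x in features
    ) / len(features)
    return math.exp(-mean_min)


def summary_reward(features, actions):
    """Sum of the two assumed reward terms for one sampled summary."""
    selected = [t for t, a in enumerate(actions) if a == 1]
    return diversity_reward(features, selected) + representativeness_reward(
        features, selected
    )


# Toy video: four frames with 2-D features forming two visual clusters.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
probs = [0.5, 0.5, 0.5, 0.5]  # stand-in for the network's per-frame output
actions = sample_actions(probs, random.Random(0))
reward = summary_reward(features, actions)
```

On this toy input, a summary that takes one frame from each cluster (actions `[1, 0, 1, 0]`) scores higher than one that takes both frames from the same cluster (`[1, 1, 0, 0]`), which is the behavior the reward is meant to encourage.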

Taxonomy

Topics: Video Analysis and Summarization · Music and Audio Processing · Human Pose and Action Recognition