Compare and Select: Video Summarization with Multi-Agent Reinforcement Learning

Tianyu Liu

arXiv:2007.14552 · cs.CV · July 30, 2020 · 1 citation

TL;DR

This paper introduces CoSNet, a multi-agent reinforcement learning framework for video summarization that models how users compare and select clips. It is designed to handle the subjectivity of summarization, and it outperforms state-of-the-art unsupervised methods and most supervised methods.

Contribution

It proposes a novel multi-agent reinforcement learning approach inspired by user behavior, combining comparison and selection networks for improved video summarization.

Findings

01 Outperforms state-of-the-art unsupervised methods

02 Surpasses most supervised methods with full rewards

03 Effective in modeling subjective user preferences

Abstract

Video summarization aims to generate concise summaries from lengthy videos to improve the viewing experience. Because summarization is subjective, purely supervised methods may inherit errors from the annotations. To address this subjectivity, we study how general users summarize videos: they usually watch the whole video, compare interesting clips, and select some of them to form a final summary. Inspired by this behaviour, we formulate summarization as multiple sequential decision-making processes and propose the Comparison-Selection Network (CoSNet), based on multi-agent reinforcement learning. Each agent focuses on a video clip and repeatedly changes its focus over the iterations; the final focus clips of all agents form the summary. The comparison network provides the agent with the visual…
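The agent loop described in the abstract (each agent holds a focus clip, shifts it over iterations, and the final focus clips form the summary) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the function name `summarize_sketch`, the per-clip `scores`, the left/stay/right move set, and the greedy choice are hypothetical stand-ins for the learned comparison and selection networks, whose details the truncated abstract does not give. This is not the paper's implementation.

```python
import random


def summarize_sketch(scores, num_agents, num_iters=20, seed=0):
    """Toy multi-agent focus shifting (hypothetical, not CoSNet itself).

    scores: placeholder per-clip "interest" values standing in for what
            a learned comparison network might provide.
    Returns the sorted clip indices the agents end up focused on.
    """
    rng = random.Random(seed)
    n = len(scores)
    # Each agent starts focused on a distinct, randomly chosen clip.
    focus = rng.sample(range(n), num_agents)
    for _ in range(num_iters):
        for a in range(num_agents):
            # Clips currently held by the other agents are off limits.
            taken = set(focus) - {focus[a]}
            # Assumed move set: stay, or shift focus one clip left/right.
            candidates = [
                c
                for c in (focus[a] - 1, focus[a], focus[a] + 1)
                if 0 <= c < n and c not in taken
            ]
            # Greedy comparison on placeholder scores replaces the
            # learned selection policy in this sketch.
            focus[a] = max(candidates, key=lambda c: scores[c])
    return sorted(focus)


if __name__ == "__main__":
    clip_scores = [0.1, 0.9, 0.2, 0.3, 0.8, 0.1]
    print(summarize_sketch(clip_scores, num_agents=2))
```

In a reinforcement learning setting, the greedy step would be replaced by a policy sampled per agent and trained from a reward; the sketch only shows the control flow of iterative focus changes.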

Taxonomy

Topics: Video Analysis and Summarization · Music and Audio Processing · Human Motion and Animation