# Diversity-aware Multi-Video Summarization

**Authors:** Rameswar Panda, Niluthpol Chowdhury Mithun, Amit K. Roy-Chowdhury

arXiv: 1706.03123 · 2017-10-11

## TL;DR

This paper introduces an unsupervised, diversity-aware multi-video summarization framework that leverages complementarity among videos to produce diverse, representative summaries, supported by a new benchmark dataset and extensive experiments.

## Contribution

It proposes a novel diversity-aware sparse optimization method for multi-video summarization and introduces the Tour20 dataset for benchmarking.

## Key findings

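- Reports better results than state-of-the-art methods on two problems: topic-oriented video summarization and multi-view video summarization in a camera network
- Exploits complementarity among the videos in a collection to produce a diverse, informative summary
- Evaluated on the new Tour20 dataset (140 videos with multiple human-created summaries) and several multi-view datasets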

## Abstract

Most video summarization approaches have focused on extracting a summary from a single video; we propose an unsupervised framework for summarizing a collection of videos. We observe that each video in the collection may contain some information that other videos do not have, and thus exploring the underlying complementarity could be beneficial in creating a diverse informative summary. We develop a novel diversity-aware sparse optimization method for multi-video summarization by exploring the complementarity within the videos. Our approach extracts a multi-video summary which is both interesting and representative in describing the whole video collection. To efficiently solve our optimization problem, we develop an alternating minimization algorithm that minimizes the overall objective function with respect to one video at a time while fixing the other videos. Moreover, we introduce a new benchmark dataset, Tour20, that contains 140 videos with multiple human-created summaries, which were acquired in a controlled experiment. Finally, by extensive experiments on the new Tour20 dataset and several other multi-view datasets, we show that the proposed approach clearly outperforms the state-of-the-art methods on two problems: topic-oriented video summarization and multi-view video summarization in a camera network.
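The "one video at a time" scheme in the abstract can be illustrated with a toy sketch. This is not the authors' formulation: the objective below (a row-sparse self-representation term per video, plus a cross-video penalty that raises the selection cost of shots similar to shots already weighted in other videos) is an assumption made for illustration, and the names `summarize_collection`, `update_one_video`, `lam`, and `beta` are hypothetical. Each video is assumed to be a `d x n` matrix whose columns are unit-norm shot features.

```python
import numpy as np

def row_soft_threshold(Z, thresh):
    """Prox of a weighted l2,1 norm: shrink the l2 norm of row i by thresh[i]."""
    norms = np.linalg.norm(Z, axis=1)
    scale = np.maximum(1.0 - thresh / np.maximum(norms, 1e-12), 0.0)
    return Z * scale[:, None]

def update_one_video(X, Z, weights, n_steps=50):
    """Proximal-gradient steps on
    0.5 * ||X - X Z||_F^2 + sum_i weights[i] * ||Z[i, :]||_2."""
    G = X.T @ X
    step = 1.0 / max(np.linalg.norm(G, 2), 1e-12)  # 1 / Lipschitz constant
    for _ in range(n_steps):
        grad = G @ Z - G  # gradient of the reconstruction term
        Z = row_soft_threshold(Z - step * grad, step * weights)
    return Z

def summarize_collection(videos, lam=0.5, beta=0.5, n_outer=10):
    """Assumed toy objective, minimized one video at a time.

    Returns one score vector per video (row norms of Z); larger scores
    mark shots that are more useful as representatives.
    """
    Z = [np.zeros((X.shape[1], X.shape[1])) for X in videos]
    for _ in range(n_outer):
        for v, X in enumerate(videos):
            # Base sparsity weight, plus a diversity term computed from the
            # current (fixed) selections in every other video.
            weights = np.full(X.shape[1], lam)
            for u, Xu in enumerate(videos):
                if u == v:
                    continue
                r_u = np.linalg.norm(Z[u], axis=1)
                sim = np.maximum(X.T @ Xu, 0.0)  # nonnegative cosine similarity
                weights += beta * (sim @ r_u)
            Z[v] = update_one_video(X, Z[v], weights)
    return [np.linalg.norm(Zv, axis=1) for Zv in Z]

# Toy collection: both videos contain the same first shot (e1),
# and each has one shot the other lacks (e2 and e3).
video_a = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
video_b = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])
scores = summarize_collection([video_a, video_b])
print(scores)
```

In this toy run the shot shared by both videos ends up with a lower score than each video's unique shot, which is the kind of complementarity the abstract describes. Each per-video subproblem here is a convex weighted group-lasso problem, so a simple proximal-gradient inner loop suffices; the solver, regularizers, and features used in the paper may differ.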

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1706.03123/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/1706.03123/full.md

## References

79 references — full list in the complete paper: https://tomesphere.com/paper/1706.03123/full.md

---
Source: https://tomesphere.com/paper/1706.03123