Video Summarization in a Multi-View Camera Network
Rameswar Panda, Abir Das, Amit K. Roy-Chowdhury

TL;DR
This paper introduces a novel multi-view video summarization framework that leverages intra- and inter-view correlations in a joint embedding space, improving summarization quality over existing methods.
Contribution
It proposes a new joint embedding approach for multi-view videos that captures correlations across views and employs sparse selection for summarization.
Findings
Outperforms state-of-the-art methods on several benchmark datasets
The embedding is obtained directly by solving a single eigenvalue problem
Sparse representative selection over the learned embedding produces the multi-view summaries
Abstract
While most existing video summarization approaches aim to extract an informative summary of a single video, we propose a novel framework for summarizing multi-view videos by exploiting both intra- and inter-view content correlations in a joint embedding space. We learn the embedding by minimizing an objective function that has two terms: one due to intra-view correlations and another due to inter-view correlations across the multiple views. The solution can be obtained directly by solving one eigenvalue problem that is linear in the number of multi-view videos. We then employ a sparse representative selection approach over the learned embedding space to summarize the multi-view videos. Experimental results on several benchmark datasets demonstrate that our proposed approach clearly outperforms the state-of-the-art.
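The two stages described in the abstract (a joint embedding obtained from an eigenvalue problem, followed by sparse representative selection) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact objective, so the sketch assumes a Laplacian-eigenmap-style embedding with Gaussian affinities, a hypothetical weight `alpha` that scales inter-view affinities relative to intra-view ones, and a row-sparse (L2,1-regularized) self-representation solved by proximal gradient descent. It also assumes every view uses features of the same dimension. The function names and the parameters `alpha`, `sigma`, and `lam` are illustrative only.

```python
import numpy as np


def joint_embedding(views, dim=2, alpha=0.5, sigma=1.0):
    """Embed frames from all views into one space (illustrative sketch).

    views: list of (n_k, d) feature arrays, one per camera view.
    alpha: assumed weight on inter-view affinities (hypothetical parameter).
    Returns an (N, dim) array, N = total number of frames.
    """
    X = np.vstack(views)
    labels = np.concatenate([np.full(len(v), k) for k, v in enumerate(views)])

    # Gaussian affinity between every pair of frames.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))

    # Intra-view pairs keep full weight; inter-view pairs are scaled by alpha.
    same_view = labels[:, None] == labels[None, :]
    W = np.where(same_view, W, alpha * W)
    np.fill_diagonal(W, 0.0)

    # Graph Laplacian and its symmetric normalization.
    deg = W.sum(axis=1)
    L = np.diag(deg) - W
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]

    # One symmetric eigenvalue problem; skip the trivial first eigenvector.
    _, vecs = np.linalg.eigh(L_sym)
    return d_inv_sqrt[:, None] * vecs[:, 1:dim + 1]


def representative_scores(Y, lam=0.1, n_iter=200):
    """Score each embedded frame by the row norm of a row-sparse coefficient matrix.

    Approximately minimizes 0.5 * ||M - M Z||_F^2 + lam * sum_i ||Z[i, :]||_2
    with M = Y.T, using proximal gradient descent.
    """
    M = Y.T                      # columns are embedded frames
    G = M.T @ M                  # Gram matrix, shape (N, N)
    step = 1.0 / (np.linalg.norm(G, 2) + 1e-12)  # 1 / Lipschitz constant
    Z = np.zeros_like(G)
    for _ in range(n_iter):
        V = Z - step * (G @ Z - G)               # gradient step
        norms = np.linalg.norm(V, axis=1, keepdims=True)
        shrink = np.maximum(0.0, 1.0 - step * lam / np.maximum(norms, 1e-12))
        Z = shrink * V                           # row-wise soft thresholding
    return np.linalg.norm(Z, axis=1)


def summarize(views, k=5, **embed_kwargs):
    """Return indices (into the stacked frames) of the k highest-scoring frames."""
    Y = joint_embedding(views, **embed_kwargs)
    scores = representative_scores(Y)
    return np.argsort(scores)[::-1][:k]
```

Calling `summarize([feats_view1, feats_view2], k=5)` with two hypothetical feature arrays returns five frame indices into the stacked frames, ranked by row norm. Ranking by row norm is one common way to read representatives off a row-sparse solution; the paper may use a different selection rule.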
