Unsupervised Object-Level Video Summarization with Online Motion Auto-Encoder
Yujia Zhang, Xiaodan Liang, Dingwen Zhang, Min Tan, and Eric P. Xing

TL;DR
This paper introduces an unsupervised, online framework for object-level video summarization that captures key object motions, advancing the understanding of fine-grained semantic and motion information in videos.
Contribution
It proposes a novel online motion Auto-Encoder framework for unsupervised object-level video summarization that captures the key motions of objects.
Findings
Effective on a newly collected surveillance dataset and on public datasets
Captures fine-grained object motions
Outperforms existing methods in unsupervised summarization
Abstract
Unsupervised video summarization plays an important role in digesting, browsing, and searching the ever-growing volume of videos produced every day, yet the underlying fine-grained semantic and motion information (i.e., objects of interest and their key motions) in online videos has barely been touched. In this paper, we investigate a pioneering research direction: fine-grained unsupervised object-level video summarization. It differs from existing pipelines in two respects: it extracts the key motions of participating objects, and it learns to summarize in an unsupervised and online manner. To achieve this goal, we propose a novel online motion Auto-Encoder (online motion-AE) framework that operates on super-segmented object motion clips. Comprehensive experiments on a newly collected surveillance dataset and public datasets demonstrate the effectiveness of the proposed method.
Taxonomy
Topics: Video Analysis and Summarization · Music and Audio Processing · Human Pose and Action Recognition
