Predicting Important Objects for Egocentric Video Summarization

Yong Jae Lee; Kristen Grauman

arXiv:1505.04803·cs.CV·May 20, 2015

TL;DR

This paper introduces a novel egocentric video summarization method that identifies and highlights important objects and interactions, producing concise storyboards without user-specific training.

Contribution

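The paper develops an importance prediction model that scores video regions from egocentric saliency cues (such as nearness to hands, gaze, and frequency of occurrence) and generalizes across users and contexts, improving object-driven video summarization.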

Findings

1. Outperforms existing saliency and summarization techniques.
2. Effectively predicts importance of unseen objects and people.
3. Produces compact, meaningful video summaries.

Abstract

We present a video summarization approach for egocentric or "wearable" camera data. Given hours of video, the proposed method produces a compact storyboard summary of the camera wearer's day. In contrast to traditional keyframe selection techniques, the resulting summary focuses on the most important objects and people with which the camera wearer interacts. To accomplish this, we develop region cues indicative of high-level saliency in egocentric video (such as the nearness to hands, gaze, and frequency of occurrence) and learn a regressor to predict the relative importance of any new region based on these cues. Using these predictions and a simple form of temporal event detection, our method selects frames for the storyboard that reflect the key object-driven happenings. We adjust the compactness of the final summary given either an importance selection criterion or a length budget;…
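The final selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each frame already carries a predicted importance score and an event label, and it simplifies the selection to one frame per event. The names `Frame`, `event_id`, `importance`, `min_importance`, `budget`, and `select_storyboard` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Frame:
    index: int         # position of the frame in the video
    event_id: int      # assumed label from some temporal event detector
    importance: float  # assumed predicted importance of the frame's top region


def select_storyboard(frames: List[Frame],
                      min_importance: Optional[float] = None,
                      budget: Optional[int] = None) -> List[int]:
    """Pick at most one frame per event, then trim by threshold or budget."""
    # Keep the highest-scoring frame within each event (a simplification).
    best: Dict[int, Frame] = {}
    for f in frames:
        if f.event_id not in best or f.importance > best[f.event_id].importance:
            best[f.event_id] = f
    candidates = list(best.values())

    # Importance criterion: drop events whose best frame scores too low.
    if min_importance is not None:
        candidates = [f for f in candidates if f.importance >= min_importance]

    # Length budget: keep only the `budget` highest-scoring frames.
    if budget is not None:
        candidates = sorted(candidates, key=lambda f: f.importance,
                            reverse=True)[:budget]

    # Present the storyboard in temporal order.
    return sorted(f.index for f in candidates)


# Toy data: three events with made-up scores.
frames = [
    Frame(0, 0, 0.2), Frame(1, 0, 0.9),
    Frame(2, 1, 0.4), Frame(3, 1, 0.3),
    Frame(4, 2, 0.7),
]
print(select_storyboard(frames, min_importance=0.5))
print(select_storyboard(frames, budget=1))
```

Passing `min_importance` mirrors the importance selection criterion, while `budget` mirrors the length budget; either one makes the summary more compact.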
