Summarizing First-Person Videos from Third Persons' Points of Views

Hsuan-I Ho; Wei-Chen Chiu; Yu-Chiang Frank Wang

arXiv:1711.08922 · cs.CV · July 27, 2018

TL;DR

This paper introduces a semi-supervised deep neural network model that effectively summarizes first-person videos by leveraging annotated third-person videos and limited first-person data, addressing the challenge of viewpoint differences.

Contribution

A novel deep learning architecture designed for first-person video summarization using semi-supervised learning with mixed viewpoint data.

Findings

01. Effective summarization of first-person videos demonstrated
02. Model generalizes well across different viewpoints
03. Qualitative and quantitative results show improved performance

Abstract

Video highlighting, or summarization, is among the interesting topics in computer vision and benefits a variety of applications such as viewing, searching, and storage. However, most existing studies rely on training data from third-person videos, and the resulting models do not easily generalize to highlighting first-person ones. With the goal of deriving an effective model to summarize first-person videos, we propose a novel deep neural network architecture for describing and discriminating vital spatiotemporal information across videos with different points of view. Our proposed model is realized in a semi-supervised setting, in which fully annotated third-person videos, unlabeled first-person videos, and a small number of annotated first-person ones are presented during training. In our experiments, qualitative and quantitative evaluations on both benchmarks and our collected first-person video datasets are…

Taxonomy

Topics: Video Analysis and Summarization · Human Pose and Action Recognition · Advanced Image and Video Retrieval Techniques