Unsupervised Video Understanding by Reconciliation of Posture Similarities

Timo Milbich; Miguel Bautista; Ekaterina Sutter; Bjorn Ommer

arXiv:1708.01191 · cs.CV · August 4, 2017

TL;DR

This paper introduces an unsupervised deep learning method for understanding human activities in videos by learning posture representations without manual annotations, enabling retrieval, super-resolution, and frame synthesis.

Contribution

It presents a novel unsupervised approach that combines sequence matching and CNNs to learn structured posture embeddings from raw video data.

Findings

01. Learns posture representations without supervision

02. Enables posture retrieval and temporal super-resolution

03. Allows frame synthesis based on learned embeddings

Abstract

Understanding human activity and being able to explain it in detail surpasses mere action classification by far in both complexity and value. The challenge is thus to describe an activity on the basis of its most fundamental constituents, the individual postures and their distinctive transitions. Supervised learning of such a fine-grained representation based on elementary poses is very tedious and does not scale. Therefore, we propose a completely unsupervised deep learning procedure based solely on video sequences, which starts from scratch without requiring pre-trained networks, predefined body models, or keypoints. A combinatorial sequence matching algorithm proposes relations between frames from subsets of the training data, while a CNN is reconciling the transitivity conflicts of the different subsets to learn a single concerted pose embedding despite changes in appearance across…
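To make the idea of proposing relations between frames more concrete, here is a minimal sketch of aligning two frame sequences. It uses dynamic time warping (DTW) as a stand-in; it is not the paper's combinatorial sequence matching algorithm, whose details are not given in the abstract. The function names `frame_distance` and `align_sequences`, and the use of plain per-frame feature vectors with Euclidean distance, are assumptions made purely for illustration.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two per-frame feature vectors (assumed representation)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def align_sequences(seq_a, seq_b):
    """Align two frame sequences with dynamic time warping (a stand-in technique).

    Returns (total_cost, pairs), where pairs is a list of (i, j) index pairs
    proposing that frame i of seq_a corresponds to frame j of seq_b.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")

    # cost[i][j] = cheapest alignment of the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the end to recover the proposed frame correspondences.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        pairs.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    pairs.reverse()
    return cost[n][m], pairs


# Toy example: the second sequence lingers on the first posture for one extra frame.
seq_a = [[0.0], [1.0], [2.0]]
seq_b = [[0.0], [0.0], [1.0], [2.0]]
total, pairs = align_sequences(seq_a, seq_b)
print(total, pairs)
```

In this toy setup, the returned index pairs play the role of proposed frame relations between two sequences. Under the abstract's description, relations proposed on different subsets of the training data may conflict with each other, and the CNN is what reconciles them into a single embedding; that reconciliation step is not sketched here.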
