Multiview Pseudo-Labeling for Semi-supervised Learning from Video

Bo Xiong; Haoqi Fan; Kristen Grauman; Christoph Feichtenhofer

arXiv:2104.00682 · cs.CV · April 2, 2021

TL;DR

This paper introduces a multiview pseudo-labeling framework for semi-supervised video learning that leverages appearance and motion views to improve representation quality without extra inference costs.

Contribution

The paper proposes a multiview pseudo-labeling approach that uses complementary appearance and motion information to obtain more reliable pseudo-labels on unlabeled video for semi-supervised learning.

Findings

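01. Substantially outperforms its purely supervised counterpart on multiple video recognition datasets

02. Compares favorably to previous self-supervised video representation learning methods on standard benchmarks

03. Adds no computation overhead at inference time, because a single model is shared across appearance and motion input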

Abstract

We present a multiview pseudo-labeling approach to video learning, a novel framework that uses complementary views in the form of appearance and motion information for semi-supervised learning in video. The complementary views help obtain more reliable pseudo-labels on unlabeled video, to learn stronger video representations than from purely supervised data. Though our method capitalizes on multiple views, it nonetheless trains a model that is shared across appearance and motion input and thus, by design, incurs no additional computation overhead at inference time. On multiple video recognition datasets, our method substantially outperforms its supervised counterpart, and compares favorably to previous work on standard benchmarks in self-supervised video representation learning.
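
The abstract does not say how predictions from the different views are combined, so the following is only a minimal sketch of one plausible reading: a shared model scores each unlabeled clip once per view, the per-view class probabilities are averaged, and a clip keeps its pseudo-label only when the averaged confidence clears a threshold. The function name `multiview_pseudo_labels`, the averaging rule, the `threshold` value, and the toy logits are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax over the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def multiview_pseudo_labels(view_logits, threshold=0.8):
    """Combine per-view predictions into pseudo-labels (hypothetical helper).

    view_logits: dict mapping a view name to an array of shape
        (num_clips, num_classes) holding the shared model's logits for
        that view of the same unlabeled clips.
    threshold: assumed confidence cutoff; not a value from the paper.

    Returns (labels, mask): the argmax class per clip and a boolean mask
    marking which clips are confident enough to be used for training.
    """
    # Assumption: views are combined by a plain average of probabilities.
    probs = np.mean([softmax(l) for l in view_logits.values()], axis=0)
    labels = probs.argmax(axis=1)
    mask = probs.max(axis=1) >= threshold
    return labels, mask


# Toy logits for two unlabeled clips and three classes (made-up numbers).
view_logits = {
    "appearance": np.array([[4.0, 0.0, 0.0], [1.0, 1.2, 0.9]]),
    "motion": np.array([[3.0, 0.0, 0.5], [0.2, 0.1, 0.3]]),
}
labels, mask = multiview_pseudo_labels(view_logits)
print(labels, mask)
```

In this toy input both views agree confidently on the first clip, so it keeps its pseudo-label, while the second clip is ambiguous in both views and is masked out. Because the same model handles every view in this sketch, inference on a single view would use that one model unchanged, which matches the abstract's claim of no added inference cost.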

Taxonomy

Topics: Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning · Video Surveillance and Tracking Methods