Self-supervised Learning for Video Correspondence Flow

Zihang Lai; Weidi Xie

arXiv:1905.00875 · cs.CV · July 30, 2019 · 56 cites

TL;DR

This paper introduces a self-supervised approach to learning video correspondence flow. It exploits the natural spatial-temporal coherence of video to learn features that are robust for matching, and reports state-of-the-art results on the DAVIS 2017 and JHMDB datasets.

Contribution

It proposes a self-supervised framework that combines an information bottleneck with recursive training over long temporal windows, and it demonstrates significant performance improvements.

Findings

01. State-of-the-art results on the DAVIS 2017 and JHMDB datasets.

02. Robust feature learning without manual annotations.

03. Enhanced performance with additional diverse training data.

Abstract

The objective of this paper is self-supervised learning of feature embeddings that are suitable for matching correspondences along videos, which we term correspondence flow. By leveraging the natural spatial-temporal coherence in videos, we propose to train a "pointer" that reconstructs a target frame by copying pixels from a reference frame. We make the following contributions: First, we introduce a simple information bottleneck that forces the model to learn robust features for correspondence matching and prevents it from learning trivial solutions, e.g. matching based on low-level colour information. Second, to tackle the challenges of tracker drift caused by complex object deformations, illumination changes and occlusions, we propose to train a recursive model over long temporal windows with scheduled sampling and cycle consistency. Third, we achieve state-of-the-art…
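To make the "pointer" idea concrete, the following is a minimal pure-Python sketch of reconstructing a target frame by copying pixels from a reference frame. It is an illustration under stated assumptions, not the authors' implementation. The function names (`softmax`, `copy_from_reference`, `reconstruction_error`), the dot-product affinity, the `temperature` parameter, and the mean-absolute-error objective are all hypothetical choices made for this example. In the actual method the feature vectors would come from a learned network, whereas here they are given by hand.

```python
import math


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def copy_from_reference(target_feats, ref_feats, ref_pixels, temperature=1.0):
    """Reconstruct each target pixel as an affinity-weighted copy of reference pixels.

    target_feats: one feature vector per target location
    ref_feats:    one feature vector per reference location
    ref_pixels:   one pixel (list of channel values) per reference location
    The affinity (softmax over dot products) is an assumption of this sketch.
    """
    n_channels = len(ref_pixels[0])
    reconstruction = []
    for t in target_feats:
        # Soft "pointer": a distribution over reference locations.
        weights = softmax([dot(t, r) for r in ref_feats], temperature)
        pixel = [
            sum(w * p[c] for w, p in zip(weights, ref_pixels))
            for c in range(n_channels)
        ]
        reconstruction.append(pixel)
    return reconstruction


def reconstruction_error(predicted, actual):
    """Mean absolute error between two frames (a stand-in training signal)."""
    diffs = [abs(p - a) for pp, aa in zip(predicted, actual) for p, a in zip(pp, aa)]
    return sum(diffs) / len(diffs)


# Toy example: two locations whose contents swap places between frames.
ref_feats = [[1.0, 0.0], [0.0, 1.0]]
ref_pixels = [[255.0, 0.0, 0.0], [0.0, 0.0, 255.0]]
target_feats = [[0.0, 1.0], [1.0, 0.0]]
target_pixels = [[0.0, 0.0, 255.0], [255.0, 0.0, 0.0]]

recon = copy_from_reference(target_feats, ref_feats, ref_pixels, temperature=0.05)
print([[round(v) for v in px] for px in recon])  # -> [[0, 0, 255], [255, 0, 0]]
print(reconstruction_error(recon, target_pixels))
```

No labels are needed in this setup: the only supervision is how well the copied pixels match the real target frame, so good features are those that point to the right reference locations. A low temperature makes the pointer close to a hard copy, while a higher one blends several reference pixels.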

Code & Models

Repositories

zlai0/CorrFlow (PyTorch, official)

Taxonomy

Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Advanced Vision and Imaging