Self-Supervised Video Representation Learning by Video Incoherence Detection
Haozhi Cao, Yuecong Xu, Jianfei Yang, Kezhi Mao, Lihua Xie, Jianxiong Yin, Simon See

TL;DR
This paper presents a self-supervised video representation learning approach that detects incoherence within videos, enabling the model to understand high-level video semantics and improve performance on action recognition and retrieval tasks.
Contribution
It introduces a novel incoherence detection framework combined with intra-video contrastive learning for enhanced self-supervised video representation learning.
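As a rough illustration of the contrastive component, the sketch below computes an InfoNCE-style loss for one anchor clip. The choice of InfoNCE, cosine similarity, the temperature value, and the use of clips from other videos as negatives are all assumptions made for illustration; the summary only states that contrastive learning is applied between incoherent clips from the same raw video. The function names `cosine` and `info_nce` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for a single anchor embedding.

    anchor, positive: embeddings of two incoherent clips assumed to come
        from the same raw video.
    negatives: embeddings of clips assumed to come from other videos.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

# Toy embeddings: the positive is aligned with the anchor, the negative is not.
anchor = [1.0, 0.0]
loss_aligned = info_nce(anchor, [1.0, 0.0], [[0.0, 1.0]])
loss_misaligned = info_nce(anchor, [0.0, 1.0], [[1.0, 0.0]])
```

Under these assumptions the loss is lower when the two clips from the same video have similar embeddings, which is the behavior a contrastive objective is meant to encourage.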
Findings
Achieves state-of-the-art results on multiple datasets.
Outperforms previous coherence-based methods.
Effective across various backbone networks.
Abstract
This paper introduces a novel self-supervised method that leverages incoherence detection for video representation learning. It stems from the observation that the human visual system can easily identify video incoherence based on a comprehensive understanding of videos. Specifically, the training sample, denoted as the incoherent clip, is constructed from multiple sub-clips hierarchically sampled from the same raw video, with various lengths of incoherence between them. The network is trained to learn high-level representations by predicting the location and length of incoherence given the incoherent clip as input. Additionally, intra-video contrastive learning is introduced to maximize the mutual information between incoherent clips from the same raw video. We evaluate our proposed method through extensive experiments on action recognition and video retrieval utilizing…
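The construction of an incoherent clip can be sketched in a simplified form. The example below is a minimal illustration, not the authors' implementation: it assumes exactly two sub-clips, a fixed output length, and a uniformly sampled gap, and it does not reproduce the hierarchical sampling mentioned in the abstract. The function name `sample_incoherent_clip` and the parameters `clip_len` and `max_gap` are hypothetical.

```python
import random

def sample_incoherent_clip(num_frames, clip_len=16, max_gap=4, rng=None):
    """Pick frame indices for one incoherent clip from a raw video.

    Simplifying assumptions (not taken from the paper): two sub-clips,
    joined at position `location` in the output clip, with `gap` frames of
    the raw video skipped between them.

    Returns (indices, location, gap); `location` and `gap` would serve as
    the targets for the location and length prediction tasks.
    """
    rng = rng or random.Random()
    location = rng.randint(1, clip_len - 1)   # where the jump occurs in the clip
    gap = rng.randint(1, max_gap)             # number of skipped raw frames
    span = clip_len + gap                     # raw frames covered in total
    if span > num_frames:
        raise ValueError("raw video is too short for the requested clip")
    start = rng.randint(0, num_frames - span)
    first = list(range(start, start + location))
    second_start = start + location + gap
    second = list(range(second_start, second_start + clip_len - location))
    return first + second, location, gap

# Example: sample one clip from a hypothetical 100-frame video.
indices, location, gap = sample_incoherent_clip(100, rng=random.Random(0))
```

In this sketch every pair of neighboring frames is consecutive in the raw video except at `location`, where the index jumps by `gap + 1`. A network trained on such clips would receive the frames at `indices` as input and be asked to predict `location` and `gap`.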
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Video Analysis and Summarization
Methods: Contrastive Learning
