TL;DR
This paper shows that a deep model given only mild supervised training can progressively improve its classification accuracy through semi-supervised incremental tuning on unlabeled video sequences, by exploiting the temporal coherence between nearby frames.
Contribution
It provides evidence that temporal coherence can be effectively used for semi-supervised incremental tuning of deep architectures, improving accuracy with unlabeled video data.
Findings
Semi-supervised tuning improves classification accuracy.
Temporal coherence is crucial for effective semi-supervised learning.
In some cases, semi-supervised tuning brings accuracy close to that of fully supervised training, even when the model starts from only mild supervised training.
Abstract
Recent works demonstrated the usefulness of temporal coherence to regularize supervised training or to learn invariant features with deep architectures. In particular, enforcing smooth output changes while presenting temporally close frames from video sequences proved to be an effective strategy. In this paper we demonstrate the efficacy of temporal coherence for semi-supervised incremental tuning. We show that a deep architecture, only mildly trained in a supervised manner, can progressively improve its classification accuracy if exposed to video sequences of unlabeled data. The extent to which semi-supervised tuning can, in some cases, improve classification accuracy (approaching the supervised one) is somewhat surprising. A number of control experiments pointed out the fundamental role of temporal coherence.
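To make "enforcing smooth output changes" concrete, here is a minimal sketch of one possible temporal coherence penalty: the mean squared change between a classifier's outputs on consecutive frames. This is an illustration only, not the paper's actual loss or training procedure; the function name `temporal_coherence_penalty` and the example output arrays are assumptions introduced here.

```python
import numpy as np

def temporal_coherence_penalty(outputs):
    """Mean squared change between outputs of consecutive frames.

    outputs: array-like of shape (num_frames, num_classes), e.g. the
    class probabilities a network assigns to each frame of one video.
    Hypothetical helper for illustration; not taken from the paper.
    """
    outputs = np.asarray(outputs, dtype=float)
    if outputs.shape[0] < 2:
        # A single frame has no temporal neighbour to compare against.
        return 0.0
    diffs = outputs[1:] - outputs[:-1]          # frame-to-frame change
    return float(np.mean(np.sum(diffs ** 2, axis=1)))

# Made-up outputs for two short sequences of the same object.
smooth = np.array([[0.90, 0.10], [0.88, 0.12], [0.91, 0.09]])
jumpy = np.array([[0.90, 0.10], [0.20, 0.80], [0.85, 0.15]])

# The sequence whose predictions flip between frames is penalized more.
print(temporal_coherence_penalty(smooth) < temporal_coherence_penalty(jumpy))
```

A penalty of this kind needs no labels, which is why it can in principle be applied to unlabeled video; how the paper combines it with the mildly supervised model is not specified in this summary.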
