Transfer Learning for Video Recognition with Scarce Training Data for Deep Convolutional Neural Network
Yu-Chuan Su, Tzu-Hsuan Chiu, Chun-Yen Yeh, Hsin-Fu Huang, Winston H. Hsu

TL;DR
This paper demonstrates that transfer learning from images to videos enables effective video recognition with limited training data, significantly reducing the need for large annotated video datasets.
Contribution
The authors propose transferring knowledge from a weakly labeled image corpus to video recognition, enabling robust recognition with only 4,000 training videos.
Findings
Transfer learning improves recognition accuracy on scarce data.
Only 4,000 videos needed for effective training.
Heuristic for meta-parameter selection developed.
Abstract
Unconstrained video recognition and Deep Convolution Network (DCN) have recently been two active topics in computer vision. In this work, we apply DCNs as frame-based recognizers for video recognition. Our preliminary studies, however, show that video corpora with complete ground truth are usually not large and diverse enough to learn a robust model. Networks trained directly on the video data set suffer from significant overfitting and have a poor recognition rate on the test set. The same lack-of-training-sample problem limits the usage of deep models on a wide range of computer vision problems where obtaining training data is difficult. To overcome the problem, we perform transfer learning from images to videos to utilize the knowledge in the weakly labeled image corpus for video recognition. The image corpus helps to learn important visual patterns for natural images, while these…
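As a rough illustration of what "frame-based recognizers" can mean in practice, the sketch below turns per-frame class scores into a single video-level prediction. The averaging rule and the function names `aggregate_frame_scores` and `predict_video` are assumptions made for this example; the abstract does not state how the authors combine frame-level outputs.

```python
from typing import List, Sequence


def aggregate_frame_scores(frame_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average per-frame class scores into one video-level score vector.

    Averaging is an assumed aggregation rule, not one taken from the paper.
    """
    if not frame_scores:
        raise ValueError("need scores for at least one frame")
    num_classes = len(frame_scores[0])
    if any(len(scores) != num_classes for scores in frame_scores):
        raise ValueError("every frame must score the same number of classes")

    totals = [0.0] * num_classes
    for scores in frame_scores:
        for class_index, score in enumerate(scores):
            totals[class_index] += score
    return [total / len(frame_scores) for total in totals]


def predict_video(frame_scores: Sequence[Sequence[float]]) -> int:
    """Return the index of the class with the highest averaged score."""
    video_scores = aggregate_frame_scores(frame_scores)
    return max(range(len(video_scores)), key=video_scores.__getitem__)


if __name__ == "__main__":
    # Hypothetical scores for a two-frame video over two classes.
    scores = [[0.5, 0.5], [0.25, 0.75]]
    print(aggregate_frame_scores(scores), predict_video(scores))
```

In a setup like this, the per-frame scores would come from a network applied to each sampled frame independently, so the same image-trained model can be reused on video without any temporal component.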
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Human Pose and Action Recognition
Methods: Convolution
