Dynamic texture and scene classification by transferring deep image features
Xianbiao Qi, Chun-Guang Li, Guoying Zhao, Xiaopeng Hong, Matti, Pietik\"ainen

TL;DR
This paper introduces a transfer learning approach using deep image features for dynamic texture and scene classification, leveraging a pre-trained ConvNet to extract robust features from video frames.
Contribution
It proposes a novel two-level feature extraction scheme called Transferred ConvNet Feature (TCoF), utilizing spatial and temporal information for improved video classification.
Findings
TCoF outperforms existing methods on benchmark datasets.
Spatial and temporal TCoF schemes achieve superior accuracy.
The approach effectively handles variations in illumination, viewpoint, and motion.
Abstract
Dynamic texture and scene classification are two fundamental problems in understanding natural video content. Extracting robust and effective features is a crucial step towards solving these problems. However the existing approaches suffer from the sensitivity to either varying illumination, or viewpoint changing, or even camera motion, and/or the lack of spatial information. Inspired by the success of deep structures in image classification, we attempt to leverage a deep structure to extract feature for dynamic texture and scene classification. To tackle with the challenges in training a deep structure, we propose to transfer some prior knowledge from image domain to video domain. To be specific, we propose to apply a well-trained Convolutional Neural Network (ConvNet) as a mid-level feature extractor to extract features from each frame, and then form a representation of a video by…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Image and Video Retrieval Techniques · Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications
