Deep Image-to-Video Adaptation and Fusion Networks for Action Recognition

Yang Liu; Zhaoyang Lu; Jing Li; Tao Yang; Chao Yao

arXiv:1911.10751 · cs.CV · February 19, 2020

TL;DR

This paper introduces DIVAFN, a deep learning framework that transfers knowledge from images to videos for action recognition by bridging domain and modality gaps using autoencoders and semantic representations.

Contribution

The paper proposes a novel unified deep model that combines domain-invariant learning and cross-modal fusion to improve video action recognition using image data.

Findings

01. Outperforms state-of-the-art methods on four datasets.

02. Effectively reduces modality shift among images, keyframes, and videos.

03. Enhances action recognition accuracy through semantic feature integration.

Abstract

Existing deep learning methods for action recognition in videos require a large number of labeled videos for training, which are labor-intensive and time-consuming to collect. For the same action, the knowledge learned from different media types, e.g., videos and images, may be related and complementary. However, because of domain shift and heterogeneous feature representations between videos and images, the performance of classifiers trained on images may degrade dramatically when they are deployed directly on videos. In this paper, we propose a novel method, named Deep Image-to-Video Adaptation and Fusion Networks (DIVAFN), to enhance action recognition in videos by transferring knowledge from images using video keyframes as a bridge. DIVAFN is a unified deep learning model that integrates domain-invariant representation learning and cross-modal feature fusion into a unified optimization…
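The degradation under domain shift that the abstract describes can be illustrated with a toy sketch. This is not the DIVAFN method: it swaps in a nearest-centroid classifier and plain mean alignment, and every feature value, variable, and function name below is a hypothetical chosen for illustration, not data or code from the paper.

```python
# Toy illustration only: hand-made 2-D "features", not data from the paper.

def centroid(points):
    """Mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit_centroids(features, labels):
    """Nearest-centroid classifier: one centroid per class label."""
    by_class = {}
    for f, y in zip(features, labels):
        by_class.setdefault(y, []).append(f)
    return {y: centroid(fs) for y, fs in by_class.items()}

def predict(centroids, f):
    """Return the label whose centroid is closest to f."""
    return min(centroids, key=lambda y: sq_dist(centroids[y], f))

def accuracy(centroids, features, labels):
    hits = sum(predict(centroids, f) == y for f, y in zip(features, labels))
    return hits / len(labels)

def mean_align(target, source):
    """Shift target features so their overall mean matches the source mean."""
    mu_t, mu_s = centroid(target), centroid(source)
    return [[x - t + s for x, t, s in zip(f, mu_t, mu_s)] for f in target]

# Hypothetical "image" (source) features for two action classes.
image_feats = [[0.0, 0.0], [0.2, 0.1], [2.0, 2.0], [2.2, 1.9]]
image_labels = [0, 0, 1, 1]

# Hypothetical "video" (target) features: same classes, constant offset.
shift = [3.0, 3.0]
video_feats = [[x + s for x, s in zip(f, shift)] for f in image_feats]
video_labels = image_labels

clf = fit_centroids(image_feats, image_labels)

# Classifier trained on images, applied directly to shifted video features.
acc_direct = accuracy(clf, video_feats, video_labels)

# Same classifier after aligning the video feature mean to the image mean.
acc_aligned = accuracy(clf, mean_align(video_feats, image_feats), video_labels)

print(acc_direct, acc_aligned)
```

In this constructed case the image-trained classifier gets half of the shifted video samples wrong, and accuracy recovers once the two feature distributions are aligned. The alignment step uses only unlabeled target features, which is the setting where transferring knowledge from labeled images is attractive. Real image-to-video gaps are not a constant offset, which is why learned domain-invariant representations are used instead of a fixed shift.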
