Condensing a Sequence to One Informative Frame for Video Recognition

Zhaofan Qiu; Ting Yao; Yan Shu; Chong-Wah Ngo; Tao Mei

arXiv:2201.04022 · cs.CV · January 12, 2022

TL;DR

This paper introduces a novel method called Informative Frame Synthesis (IFS) that condenses videos into a single informative frame capturing essential spatio-temporal information for efficient video recognition.

Contribution

The paper proposes an end-to-end trainable IFS architecture with multiple objectives and regularizers, enabling effective video-to-image synthesis for recognition tasks.

Findings

01. IFS outperforms baseline methods in video recognition accuracy.

02. It improves the performance of both image-based 2D networks and clip-based 3D networks.

03. IFS achieves results comparable to state-of-the-art methods with less computation.

Abstract

Video is complex due to large variations in motion and rich content in fine-grained visual details. Abstracting useful information from such information-intensive media requires exhaustive computing resources. This paper studies a two-step alternative that first condenses the video sequence to an informative "frame" and then exploits an off-the-shelf image recognition system on the synthetic frame. A valid question is how to define "useful information" and then distill it from a video sequence down to one synthetic frame. This paper presents a novel Informative Frame Synthesis (IFS) architecture that incorporates three objective tasks, i.e., appearance reconstruction, video categorization, and motion estimation, and two regularizers, i.e., adversarial learning and color consistency. Each task equips the synthetic frame with one ability, while each regularizer enhances its visual quality. With…
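
The abstract names three objective tasks and two regularizers but, as excerpted here, does not say how they are combined during training. The sketch below assumes the simplest option, a weighted sum of the five scalar loss terms. The names `LossWeights` and `combine_losses`, the weighted-sum form, and the default weights of 1.0 are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass

# Term names follow the abstract; everything else is an illustrative assumption.
TASKS = ("appearance_reconstruction", "video_categorization", "motion_estimation")
REGULARIZERS = ("adversarial_learning", "color_consistency")


@dataclass(frozen=True)
class LossWeights:
    """Hypothetical per-term weights; the abstract does not give any values."""
    appearance_reconstruction: float = 1.0
    video_categorization: float = 1.0
    motion_estimation: float = 1.0
    adversarial_learning: float = 1.0
    color_consistency: float = 1.0


def combine_losses(losses: dict, weights: LossWeights = LossWeights()) -> float:
    """Return the weighted sum of the five terms (assumed form, not the paper's)."""
    expected = TASKS + REGULARIZERS
    missing = [name for name in expected if name not in losses]
    if missing:
        raise KeyError(f"missing loss terms: {missing}")
    return sum(getattr(weights, name) * losses[name] for name in expected)


if __name__ == "__main__":
    # Made-up scalar values standing in for per-batch losses.
    example = {
        "appearance_reconstruction": 0.5,
        "video_categorization": 1.0,
        "motion_estimation": 0.25,
        "adversarial_learning": 0.125,
        "color_consistency": 0.125,
    }
    print(combine_losses(example))
```

In an actual training loop each entry would be a differentiable tensor rather than a Python float, and the weights would be tuned; the point here is only that the three tasks and two regularizers contribute to a single end-to-end objective.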

Taxonomy

Topics: Advanced Vision and Imaging · Image Enhancement Techniques · Advanced Image Processing Techniques