DEEPEYE: A Compact and Accurate Video Comprehension at Terminal Devices Compressed with Quantization and Tensorization

Yuan Cheng; Guangya Li; Hai-Bao Chen; Sheldon X.-D. Tan; Hao Yu

arXiv:1805.07935 · cs.CV · June 8, 2018 · 5 cites


TL;DR

DEEPEYE is a compact, accurate video comprehension system for terminal devices that combines quantization and tensorization techniques to significantly reduce model size and computation while maintaining high accuracy.

Contribution

The paper introduces a novel integrated approach using quantization and tensorization for efficient video detection and recognition on resource-constrained devices.

Findings

01. Achieves 3.994x model compression with only 0.47% mAP loss

02. Reduces parameters by 15,047x and speeds up by 2.87x

03. Improves accuracy by 16.58% on benchmark datasets
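
To give a sense of how tensorization can cut parameter counts by several orders of magnitude, the sketch below counts the parameters of one dense layer against a tensor-train style factorization of the same layer. The tensor-train format is an assumption here (this summary only says "tensorization"), and the mode sizes and ranks are hypothetical values chosen for illustration, not figures from the paper.

```python
from math import prod


def tt_param_count(in_modes, out_modes, ranks):
    """Count parameters of a tensor-train factorized weight matrix.

    Core k has shape (ranks[k], in_modes[k], out_modes[k], ranks[k + 1]).
    The boundary ranks must both be 1.
    """
    if not (len(in_modes) == len(out_modes) == len(ranks) - 1):
        raise ValueError("need one rank more than the number of modes")
    if ranks[0] != 1 or ranks[-1] != 1:
        raise ValueError("boundary ranks must be 1")
    return sum(
        ranks[k] * in_modes[k] * out_modes[k] * ranks[k + 1]
        for k in range(len(in_modes))
    )


# Hypothetical layer: 57,600 inputs mapped to 256 outputs.
in_modes = [8, 20, 20, 18]    # product = 57,600 (assumed factorization)
out_modes = [4, 4, 4, 4]      # product = 256 (assumed factorization)
ranks = [1, 4, 4, 4, 1]       # assumed tensor-train ranks

dense_params = prod(in_modes) * prod(out_modes)
tt_params = tt_param_count(in_modes, out_modes, ranks)
reduction = dense_params / tt_params

print(f"dense: {dense_params}, tensor-train: {tt_params}, "
      f"reduction: {reduction:.0f}x")
```

The dense count grows with the product of the mode sizes, while the factorized count grows with their sum (scaled by the ranks), which is why small ranks give such large reductions.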

Abstract

Video detection and classification models require a huge number of parameters when exposed to high-dimensional inputs, which makes it a grand challenge to develop a compact yet accurate video comprehension system for terminal devices. Current works optimize video detection and classification separately. In this paper, we introduce DEEPEYE, a video comprehension (object detection and action recognition) system for terminal devices. Based on You Only Look Once (YOLO), we have developed an 8-bit quantization method applied when training YOLO, as well as a tensorized-compression method for a Recurrent Neural Network (RNN) that takes features extracted from YOLO as input. The developed quantization and tensorization can significantly compress the original network model while maintaining accuracy. Using the challenging video datasets MOMENTS and UCF11 as benchmarks, the…
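
As a rough illustration of what 8-bit quantization does to a set of weights, the sketch below applies symmetric uniform quantization to a small list of floats. This is a generic post-hoc scheme chosen for illustration; the paper's method quantizes during training and may differ in its details. The function names and sample weights are hypothetical, and the 32-bit baseline is an assumption.

```python
def quantize_int8(weights):
    """Map floats to integer codes in [-127, 127] with one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero input, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from integer codes."""
    return [c * scale for c in codes]


weights = [0.5, -1.0, 0.25, 0.0]          # hypothetical weights
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

# Storage ratio, assuming the baseline stores each weight as a 32-bit float.
storage_ratio = 32 / 8

print(codes, scale, max_error, storage_ratio)
```

Under the 32-bit baseline assumption, replacing every weight with an 8-bit code gives at most a 4x reduction in weight storage, and the rounding error per weight is bounded by half the scale.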

Taxonomy

Topics: Human Pose and Action Recognition · Advanced Neural Network Applications · Multimodal Machine Learning Applications