Human Activity Recognition for Edge Devices

Manjot Bilkhu; Hammababdullah Ayyubi

arXiv:1903.07563 · cs.CV · March 19, 2019 · 1 citation

Open Access

TL;DR

This paper investigates how to run video activity recognition efficiently on edge devices, reaching accuracy comparable to state-of-the-art models such as I3D and C3D while using about one tenth of the memory.

Contribution

The authors show that comparable activity recognition accuracy can be reached on edge devices by changing the testing procedure of I3D and Temporal Segment Network models, and they report results with a ResNet backbone in addition to the original Inception backbone.

Findings

01 Achieved 84.54% top-1 accuracy on UCF-101 using only RGB frames.

02 Reduced memory usage to one tenth by changing the testing procedure.

03 Reported results with a ResNet backbone in addition to the original Inception backbone.

Abstract

Video activity recognition has recently gained a lot of momentum with the release of the massive Kinetics datasets (Kinetics-400 and Kinetics-600). Architectures such as I3D and C3D have shown state-of-the-art performance for activity recognition. The one major pitfall of these networks is that they require a lot of compute. In this paper we explore how to achieve results comparable to these state-of-the-art networks on edge devices. We primarily explore two architectures: I3D and Temporal Segment Network. We show that comparable results can be achieved with one tenth of the memory usage by changing the testing procedure. We also report results with a ResNet backbone in addition to the original Inception backbone. Specifically, we achieve 84.54% top-1 accuracy on the UCF-101 dataset using only RGB frames.
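
The abstract does not spell out the modified testing procedure, so the following is only an illustrative sketch of the general idea behind Temporal Segment Network style inference: score a few sparsely sampled frames instead of every frame, then average the per-class scores. The function names (`segment_center_indices`, `average_consensus`, `top1`) and the choice of the middle frame of each segment are assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence


def segment_center_indices(num_frames: int, num_segments: int) -> List[int]:
    """Split a video into equal-length segments and pick the middle frame of each.

    Hypothetical helper: scoring only these frames keeps the number of frames
    held in memory proportional to num_segments rather than num_frames.
    """
    if num_frames <= 0 or num_segments <= 0:
        raise ValueError("num_frames and num_segments must be positive")
    seg_len = num_frames / num_segments
    return [
        min(num_frames - 1, int(seg_len * (i + 0.5)))
        for i in range(num_segments)
    ]


def average_consensus(frame_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class scores over the sampled frames (a simple consensus)."""
    if not frame_scores:
        raise ValueError("frame_scores must not be empty")
    num_classes = len(frame_scores[0])
    return [
        sum(row[c] for row in frame_scores) / len(frame_scores)
        for c in range(num_classes)
    ]


def top1(class_scores: Sequence[float]) -> int:
    """Return the index of the highest-scoring class."""
    return max(range(len(class_scores)), key=lambda c: class_scores[c])


if __name__ == "__main__":
    # A 100-frame video reduced to 4 frames for scoring.
    print(segment_center_indices(100, 4))
    # Made-up per-frame scores for a 2-class problem, one row per sampled frame.
    video_scores = average_consensus([[0.2, 0.8], [0.6, 0.4], [0.1, 0.9]])
    print(top1(video_scores))
```

In a real pipeline the per-frame scores would come from the backbone network (Inception or ResNet in the paper); here they are made-up numbers so the sketch stays self-contained.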

Taxonomy

Topics: Human Pose and Action Recognition · Context-Aware Activity Recognition Systems · Anomaly Detection Techniques and Applications

Methods: Average Pooling · 1x1 Convolution · Batch Normalization · Bottleneck Residual Block · Global Average Pooling · Residual Block · Kaiming Initialization · Max Pooling · Residual Connection