Real-Time Action Detection in Video Surveillance using Sub-Action Descriptor with Multi-CNN

Cheng-Bin Jin, Shengzhe Li, and Hakil Kim

arXiv:1710.03383 · cs.CV · October 11, 2017 · 27 cites

TL;DR

This paper introduces a sub-action descriptor for detailed, real-time multi-person action detection in video surveillance, using multiple CNNs to localize and recognize actions, and evaluates it on the large-scale ICVL dataset and on KTH.

Contribution

It proposes a novel three-level sub-action descriptor (posture, locomotion, and gesture) and a multi-CNN detection model, giving more detailed action labels while running in real time.

Findings

01 Achieved 76.6% frame-based mAP (83.5% video-based) on the ICVL dataset

02 Outperformed state-of-the-art methods on the KTH dataset

03 Operates at 25 fps on ICVL and 80 fps on KTH

Abstract

When we say a person is texting, can you tell the person is walking or sitting? Emphatically, no. In order to solve this incomplete representation problem, this paper presents a sub-action descriptor for detailed action detection. The sub-action descriptor consists of three levels: the posture, the locomotion, and the gesture level. The three levels give three sub-action categories for one action to address the representation problem. The proposed action detection model simultaneously localizes and recognizes the actions of multiple individuals in video surveillance using appearance-based temporal features with multi-CNN. The proposed approach achieved a mean average precision (mAP) of 76.6% at the frame-based and 83.5% at the video-based measurement on the new large-scale ICVL video surveillance dataset that the authors introduce and make available to the community with this paper.…
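The three-level descriptor in the abstract can be pictured as a small record with one field per level. The sketch below is illustrative only: the label sets, the `SubActionDescriptor` class, and the `combine` helper are hypothetical names, and taking an independent argmax over each level's scores is an assumption rather than the authors' confirmed procedure.

```python
from dataclasses import dataclass

# Illustrative label sets only; the paper's actual categories may differ.
POSTURES = ("sitting", "standing")
LOCOMOTIONS = ("stationary", "walking")
GESTURES = ("nothing", "texting")


@dataclass(frozen=True)
class SubActionDescriptor:
    """One sub-action label per level for a single detected person."""
    posture: str
    locomotion: str
    gesture: str

    def describe(self) -> str:
        return f"{self.posture}, {self.locomotion}, {self.gesture}"


def argmax_label(scores, labels):
    """Return the label with the highest score (one score per label)."""
    if len(scores) != len(labels):
        raise ValueError("one score per label is required")
    best = max(range(len(labels)), key=lambda i: scores[i])
    return labels[best]


def combine(posture_scores, locomotion_scores, gesture_scores):
    """Assumed combination rule: pick the top label at each level independently."""
    return SubActionDescriptor(
        posture=argmax_label(posture_scores, POSTURES),
        locomotion=argmax_label(locomotion_scores, LOCOMOTIONS),
        gesture=argmax_label(gesture_scores, GESTURES),
    )


# Hypothetical per-level classifier outputs for one person in one frame.
descriptor = combine([0.2, 0.8], [0.1, 0.9], [0.3, 0.7])
print(descriptor.describe())
```

With a record like this, "texting" no longer hides whether the person is walking or sitting, because the posture and locomotion fields carry that information alongside the gesture.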

Code & Models

Repositories

ChengBinJin/ActionViewer

Taxonomy

Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Hand Gesture Recognition Systems