An Expressive Deep Model for Human Action Parsing from A Single Image

Zhujin Liang; Xiaolong Wang; Rui Huang; Liang Lin

arXiv:1502.00501·cs.CV·February 3, 2015

TL;DR

This paper introduces a deep learning framework that effectively recognizes human actions from still images by integrating human layout and context, overcoming pose and appearance variations without temporal data.

Contribution

It develops an expressive deep model using Deep Belief Nets that fuses multiple noisy sources and leverages labeled data for improved action recognition from single images.

Findings

1. Outperforms state-of-the-art methods in action recognition accuracy.
2. Robust to unreliable detections of human parts and objects.
3. Effective integration of human layout and context enhances understanding.

Abstract

This paper addresses an emerging task in vision and multimedia research: recognizing human actions from still images. The main challenges lie in the large variations in human pose and appearance, as well as the lack of temporal motion information. To address these problems, we propose an expressive deep model that naturally integrates human layout and surrounding context for higher-level action understanding from still images. In particular, a Deep Belief Net (DBN) is trained to fuse information from different noisy sources such as body part detection and object detection. To bridge the semantic gap, we use manually labeled data to greatly improve the effectiveness and efficiency of the pre-training and fine-tuning stages of DBN training. The resulting framework is shown to be robust to sometimes unreliable inputs (e.g., imprecise detections of human parts and objects),…
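
The fusion idea in the abstract can be illustrated with a generic building block. The sketch below is not the authors' architecture: it assumes a single binary restricted Boltzmann machine (the layer type that DBNs are commonly stacked from), trained with one-step contrastive divergence on toy vectors that concatenate hypothetical body-part scores and object-detector scores. The names `RBM`, `fuse`, the layer sizes, and the toy data are all illustrative assumptions.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


class RBM:
    """Binary RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        self.n_visible = n_visible
        self.n_hidden = n_hidden
        # Small random weights, zero biases.
        self.W = [[self.rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_v = [0.0] * n_visible
        self.b_h = [0.0] * n_hidden

    def hidden_probs(self, v):
        return [sigmoid(self.b_h[j] + sum(v[i] * self.W[i][j]
                                          for i in range(self.n_visible)))
                for j in range(self.n_hidden)]

    def visible_probs(self, h):
        return [sigmoid(self.b_v[i] + sum(h[j] * self.W[i][j]
                                          for j in range(self.n_hidden)))
                for i in range(self.n_visible)]

    def cd1_step(self, v0, lr=0.1):
        # Positive phase: hidden probabilities given the data.
        h0 = self.hidden_probs(v0)
        h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        # Negative phase: one Gibbs step back to visible, then hidden.
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        # Move parameters toward data statistics, away from model statistics.
        for i in range(self.n_visible):
            for j in range(self.n_hidden):
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
            self.b_v[i] += lr * (v0[i] - v1[i])
        for j in range(self.n_hidden):
            self.b_h[j] += lr * (h0[j] - h1[j])
        # Squared reconstruction error, useful for monitoring only.
        return sum((a - b) ** 2 for a, b in zip(v0, v1))


def fuse(part_scores, object_scores):
    # Hypothetical fusion input: concatenate scores from both noisy sources.
    return list(part_scores) + list(object_scores)


# Toy data (made up): 3 body-part scores + 2 object scores per image.
samples = [
    fuse([0.9, 0.8, 0.1], [1.0, 0.0]),
    fuse([0.1, 0.2, 0.9], [0.0, 1.0]),
]

rbm = RBM(n_visible=5, n_hidden=4)
for _ in range(200):
    for v in samples:
        rbm.cd1_step(v)

# Hidden activations act as a fused representation of both sources.
fused = rbm.hidden_probs(samples[0])
```

In a full DBN, several such layers would be stacked, with each layer's hidden activations feeding the next, and the stack would then be fine-tuned with labels. How the paper injects manually labeled data into pre-training is not specified in the abstract, so that step is left out here.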
