Human Action Adverb Recognition: ADHA Dataset and A Three-Stream Hybrid Model

Bo Pang; Kaiwen Zha; Cewu Lu

arXiv:1802.01144 · cs.CV · February 13, 2018 · 1 citation

Open Access

TL;DR

This paper introduces the first benchmark dataset for recognizing human action adverbs, analyzes existing models' limitations, and proposes a novel three-stream hybrid model that improves recognition performance.

Contribution

It presents the ADHA dataset for human action adverb recognition and introduces a new three-stream hybrid model that outperforms existing methods.

Findings

01 Existing models perform inadequately on adverb recognition.

02 The proposed three-stream hybrid model achieves better accuracy.

03 The ADHA dataset enables future research in this area.

Abstract

We introduce the first benchmark for a new problem, recognizing human action adverbs (HAA): "Adverbs Describing Human Actions" (ADHA). This is the first step for computer vision to move from pattern recognition toward real AI. We demonstrate some key features of ADHA: a semantically complete set of adverbs describing human actions, a set of common, describable human actions, and an exhaustive labeling of simultaneously occurring actions in each video. We conduct an in-depth analysis of how current effective models for action recognition and image captioning perform on adverb recognition, and the results show that such methods are unsatisfactory. Moreover, we propose a novel three-stream hybrid model to address the HAA problem, which achieves better results.

Taxonomy

Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning