Joint Learning On The Hierarchy Representation for Fine-Grained Human Action Recognition

Mei Chee Leong; Hui Li Tan; Haosong Zhang; Liyuan Li; Feng Lin; Joo-Hwee Lim

arXiv:2110.05853·cs.CV·October 13, 2021

TL;DR

This paper introduces a multi-task network leveraging hierarchy representations for fine-grained human action recognition, achieving state-of-the-art accuracy on the FineGym dataset.

Contribution

It proposes a novel multi-task learning framework that exploits hierarchical action representations with a three-pathway network and integration layers.

Findings

1. Achieved 91.80% Top-1 accuracy on the FineGym dataset.

2. Outperformed the previous best result by 3.40% in Top-1 accuracy.

3. Demonstrated effective joint learning of hierarchical action features.

Abstract

Fine-grained human action recognition is a core research topic in computer vision. Inspired by the recently proposed hierarchy representation of fine-grained actions in FineGym and by the SlowFast network for action recognition, we propose a novel multi-task network that exploits the FineGym hierarchy representation to achieve effective joint learning and prediction for fine-grained human action recognition. The multi-task network consists of three pathways of SlowOnly networks with gradually increased frame rates for events, sets and elements of fine-grained actions, followed by our proposed integration layers for joint learning and prediction. It is a two-stage approach: it first learns a deep feature representation at each hierarchical level, then performs feature encoding and fusion for multi-task learning. Our empirical results on the FineGym dataset achieve a new…
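The two ideas in the abstract, per-level pathways with increasing frame rates and a joint objective over events, sets and elements, can be illustrated with a small sketch. This is not the authors' implementation. The frame counts (4, 8, 16), the concatenation-based fusion, the equal loss weights, and every function name below are assumptions chosen for illustration; the abstract does not specify them. The SlowOnly backbones themselves are omitted, so features and logits are passed in as plain lists.

```python
import math

# Hypothetical frames-per-clip for each hierarchy level. The source only says
# the frame rate increases from events to sets to elements.
FRAMES_PER_LEVEL = {"event": 4, "set": 8, "element": 16}
LEVELS = ("event", "set", "element")


def sample_frame_indices(num_video_frames, num_samples):
    """Pick num_samples evenly spaced frame indices (centre of each segment)."""
    step = num_video_frames / num_samples
    return [int(step * i + step / 2) for i in range(num_samples)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability of the target class."""
    return -math.log(softmax(logits)[target])


def fuse_features(features_by_level):
    """Concatenate per-level feature vectors (one assumed fusion choice)."""
    fused = []
    for level in LEVELS:
        fused.extend(features_by_level[level])
    return fused


def joint_loss(logits_by_level, targets_by_level, weights=None):
    """Weighted sum of the per-level cross-entropy losses (equal by default)."""
    if weights is None:
        weights = {level: 1.0 for level in LEVELS}
    return sum(
        weights[level]
        * cross_entropy(logits_by_level[level], targets_by_level[level])
        for level in LEVELS
    )


if __name__ == "__main__":
    # Each pathway sees the same 64-frame clip at a different temporal density.
    for level in LEVELS:
        print(level, sample_frame_indices(64, FRAMES_PER_LEVEL[level]))
```

In a real system each pathway would be a video backbone producing a feature vector, `fuse_features` would be replaced by learned integration layers, and `joint_loss` would be minimized over all three label sets at once so that the coarse event and set labels regularize the element-level prediction.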
