Action Recognition with Trajectory-Pooled Deep-Convolutional Descriptors

Limin Wang; Yu Qiao; Xiaoou Tang

arXiv:1505.04868 · cs.CV · November 17, 2016

TL;DR

This paper introduces trajectory-pooled deep-convolutional descriptors (TDD), a new video feature representation that combines deep learning and trajectory-based pooling to improve human action recognition accuracy.

Contribution

The paper proposes TDD, a novel video descriptor that integrates deep convolutional features with trajectory-constrained pooling and normalization methods, enhancing action recognition performance.

Findings

01 TDD outperforms previous hand-crafted and deep-learned features.

02 TDD achieves state-of-the-art results on the HMDB51 and UCF101 datasets.

03 Two normalization methods, spatiotemporal normalization and channel normalization, are designed to make the descriptors more robust.

Abstract

Visual features are of vital importance for human action understanding in videos. This paper presents a new video representation, called trajectory-pooled deep-convolutional descriptor (TDD), which shares the merits of both hand-crafted features and deep-learned features. Specifically, we utilize deep architectures to learn discriminative convolutional feature maps, and conduct trajectory-constrained pooling to aggregate these convolutional features into effective descriptors. To enhance the robustness of TDDs, we design two normalization methods to transform convolutional feature maps, namely spatiotemporal normalization and channel normalization. The advantages of our features come from (i) TDDs are automatically learned and contain high discriminative capacity compared with those hand-crafted features; (ii) TDDs take account of the intrinsic characteristics of temporal dimension and…
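
The pipeline the abstract describes (normalize the convolutional feature maps, then pool them along trajectories) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The abstract does not spell out the formulas, so the sketch assumes max-based normalization (dividing by the maximum over the video volume for spatiotemporal normalization, and by the maximum across channels for channel normalization) and sum pooling over trajectory points. The function names, the `(T, H, W, C)` array layout, and the `scale` parameter (ratio of feature-map size to frame size) are all hypothetical.

```python
import numpy as np


def spatiotemporal_normalize(fmap, eps=1e-8):
    """Assumed form: divide each channel by its maximum over the whole
    (time, height, width) volume. fmap has shape (T, H, W, C) and is
    assumed non-negative (e.g. taken after a ReLU)."""
    peak = fmap.max(axis=(0, 1, 2), keepdims=True)
    return fmap / (peak + eps)


def channel_normalize(fmap, eps=1e-8):
    """Assumed form: at every (t, y, x) position, divide the channel
    vector by its maximum across channels."""
    peak = fmap.max(axis=3, keepdims=True)
    return fmap / (peak + eps)


def trajectory_pool(fmap, trajectory, scale):
    """Sum-pool feature vectors along one trajectory.

    trajectory: iterable of (t, x, y) points in frame coordinates.
    scale: hypothetical ratio mapping frame coordinates to feature-map
           coordinates (feature-map size / frame size).
    Returns a descriptor of length C.
    """
    _, height, width, channels = fmap.shape
    descriptor = np.zeros(channels, dtype=np.float64)
    for t, x, y in trajectory:
        # Map the point onto the feature map and clamp to its bounds.
        col = min(max(int(round(x * scale)), 0), width - 1)
        row = min(max(int(round(y * scale)), 0), height - 1)
        descriptor += fmap[int(t), row, col]
    return descriptor


def tdd_sketch(fmap, trajectory, scale, mode="spatiotemporal"):
    """Normalize the feature map, then pool along the trajectory."""
    if mode == "spatiotemporal":
        normalized = spatiotemporal_normalize(fmap)
    else:
        normalized = channel_normalize(fmap)
    return trajectory_pool(normalized, trajectory, scale)
```

Under these assumptions, each trajectory yields one descriptor whose length equals the number of channels in the chosen feature map, so descriptors from different layers can be computed with the same function by changing `fmap` and `scale`.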

Code & Models

Repositories

damien911224/theWorldInSafety