Evolution-Preserving Dense Trajectory Descriptors

Yang Wang; Vinh Tran; Minh Hoai

arXiv:1702.04037·cs.CV·February 15, 2017·2 cites

Open Access

TL;DR

This paper introduces Evolution-Preserving Trajectory (EPT) descriptors, which encode the temporal evolution of deep features along trajectories, significantly improving human action recognition performance.

Contribution

The paper proposes a novel EPT descriptor that applies rank pooling to dense trajectories, enhancing the encoding of temporal evolution in video analysis.

Findings

01 · EPT descriptors outperform previous trajectory-pooled deep descriptors

02 · Combining EPT with VideoDarwin achieves state-of-the-art results

03 · EPT provides complementary benefits to non-trajectory-based descriptors

Abstract

Recently, Trajectory-pooled Deep-learning Descriptors were shown to achieve state-of-the-art human action recognition results on a number of datasets. This paper improves their performance by applying rank pooling to each trajectory, encoding the temporal evolution of deep learning features computed along the trajectory. This leads to Evolution-Preserving Trajectory (EPT) descriptors, a novel type of video descriptor that significantly outperforms Trajectory-pooled Deep-learning Descriptors. EPT descriptors are defined based on dense trajectories, and they provide complementary benefits to video descriptors that are not based on trajectories. In particular, we show that the combination of EPT descriptors and VideoDarwin leads to state-of-the-art performance on the Hollywood2 and UCF101 datasets.
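
To make "rank pooling along a trajectory" concrete, here is a minimal sketch in Python. It is not the authors' implementation: it assumes the per-frame deep features sampled along one trajectory are already available as a (T, D) array, it smooths them with a running mean (a common choice in rank pooling), and it swaps the ranking objective for ridge regression onto the frame index, which is a simpler approximation. The function name `rank_pool` and the parameter `reg` are hypothetical.

```python
import numpy as np


def rank_pool(features, reg=1e-6):
    """Approximate rank pooling for one trajectory (illustrative sketch).

    features: (T, D) array, one D-dimensional feature per time step,
              ordered along the trajectory.
    reg:      ridge regularization strength (hypothetical parameter).

    Returns a D-dimensional vector w such that the scores V @ w tend to
    increase with time, where V holds the running means of the features.
    The vector w serves as the evolution-preserving descriptor.
    """
    features = np.asarray(features, dtype=np.float64)
    num_steps, dim = features.shape

    # Running (time-varying) mean: row t is the mean of rows 0..t.
    steps = np.arange(1, num_steps + 1, dtype=np.float64)
    running_mean = np.cumsum(features, axis=0) / steps[:, None]

    # Ridge regression of the time index on the smoothed features:
    # solve (V^T V + reg * I) w = V^T t.
    gram = running_mean.T @ running_mean + reg * np.eye(dim)
    return np.linalg.solve(gram, running_mean.T @ steps)


if __name__ == "__main__":
    # Toy trajectory: one feature grows linearly, one is constant.
    t = np.arange(1, 11, dtype=np.float64)
    toy = np.stack([t, np.ones_like(t)], axis=1)

    forward = rank_pool(toy)
    backward = rank_pool(toy[::-1])

    # The descriptor depends on temporal order, unlike mean or max pooling.
    print("forward descriptor: ", forward)
    print("backward descriptor:", backward)
```

Reversing the toy sequence changes the descriptor, which is the property that order-insensitive pooling along a trajectory lacks.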

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Multimodal Machine Learning Applications