Developing Motion Code Embedding for Action Recognition in Videos

Maxat Alibayev, David Paulius, and Yu Sun

arXiv:2012.05438 · cs.CV · August 18, 2021

TL;DR

This paper introduces motion codes, a vectorized representation of motion based on a manipulation's salient mechanical attributes. Features from a model that predicts these codes are integrated into an action recognition model, improving verb classification accuracy on egocentric videos.

Contribution

The paper presents a motion embedding strategy called motion codes, together with a deep neural network that combines visual and semantic features to embed or annotate videos with those codes.

Findings

01

Achieved higher verb classification accuracy than the baseline model on egocentric videos from the EPIC-KITCHENS dataset

02

Demonstrated the potential of motion codes as features for machine learning tasks

03

Integrated features extracted by the motion embedding model into a state-of-the-art action recognition model

Abstract

In this work, we propose a motion embedding strategy known as motion codes, which is a vectorized representation of motions based on a manipulation's salient mechanical attributes. These motion codes provide a robust motion representation, and they are obtained using a hierarchy of features called the motion taxonomy. We developed and trained a deep neural network model that combines visual and semantic features to identify the features found in our motion taxonomy to embed or annotate videos with motion codes. To demonstrate the potential of motion codes as features for machine learning tasks, we integrated the extracted features from the motion embedding model into the current state-of-the-art action recognition model. The obtained model achieved higher accuracy than the baseline model for the verb classification task on egocentric videos from the EPIC-KITCHENS dataset.
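The idea of a vectorized motion representation that is then used as an extra feature can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the attribute names and values in `TAXONOMY`, the one-hot encoding, the `stir` example, and the concatenation-based `fuse` helper are hypothetical and are not taken from the paper's motion taxonomy or model.

```python
from typing import Dict, List

# Hypothetical attribute vocabulary, for illustration only.
# The paper's motion taxonomy defines its own hierarchy of features.
TAXONOMY: Dict[str, List[str]] = {
    "interaction": ["contact", "non_contact"],
    "trajectory": ["none", "prismatic", "revolute"],
    "recurrence": ["acyclic", "cyclic"],
}


def one_hot(value: str, choices: List[str]) -> List[int]:
    """Encode one attribute value as a one-hot list over its choices."""
    if value not in choices:
        raise ValueError(f"unknown value {value!r}; expected one of {choices}")
    return [1 if choice == value else 0 for choice in choices]


def motion_code(attributes: Dict[str, str]) -> List[int]:
    """Concatenate per-attribute one-hot encodings into a single vector."""
    code: List[int] = []
    for name, choices in TAXONOMY.items():
        code.extend(one_hot(attributes[name], choices))
    return code


def fuse(visual_features: List[float], code: List[int]) -> List[float]:
    """Append the motion code to visual features (assumed fusion: concatenation)."""
    return list(visual_features) + [float(bit) for bit in code]


# Hypothetical annotation for a stirring motion.
stir = {"interaction": "contact", "trajectory": "revolute", "recurrence": "cyclic"}
code = motion_code(stir)
fused = fuse([0.12, -0.40], code)
```

In this sketch the fused vector would be passed to a downstream verb classifier; in practice the attribute values would come from a trained model's predictions rather than from a hand-written dictionary.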
