Masked Autoencoders for Egocentric Video Understanding @ Ego4D Challenge 2022

Jiachen Lei, Shuang Ma, Zhongjie Ba, Sai Vemprala, Ashish Kapoor, and Kui Ren

arXiv:2211.15286·cs.CV·November 29, 2022·1 cites

TL;DR

This report describes how team TheSSVL applied masked autoencoders to two egocentric video understanding tasks in the Ego4D Challenge 2022, Object State Change Classification and PNR Temporal Localization, and placed 2nd in both.

Contribution

The report applies masked autoencoders to egocentric video tasks and supports their effectiveness with second-place results in both challenge tracks the team entered.

Findings

01

Ranked 2nd in Object State Change Classification

02

Ranked 2nd in PNR Temporal Localization

03

Code will be publicly available

Abstract

In this report, we present our approach and empirical results of applying masked autoencoders in two egocentric video understanding tasks, namely, Object State Change Classification and PNR Temporal Localization, of Ego4D Challenge 2022. As team TheSSVL, we ranked 2nd place in both tasks. Our code will be made available.

Code & Models

Repositories

jasonrayshd/egomotion (PyTorch, official)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Human Pose and Action Recognition · Machine Learning in Healthcare