Egocentric Video Task Translation @ Ego4D Challenge 2022

Zihui Xue; Yale Song; Kristen Grauman; Lorenzo Torresani

arXiv:2302.01891·cs.CV·February 6, 2023·1 cites

Open Access

TL;DR

This paper introduces EgoTask Translation, a method that reuses models developed for related tasks to improve a primary egocentric video task. Without modifying the baseline architectures, it ranks 1st in the Ego4D talking to me challenge and 3rd in the PNR keyframe localization challenge.

Contribution

The paper presents a novel task translator that learns to translate auxiliary task features to enhance primary task performance in egocentric video analysis.

Findings

1. Achieved 1st place in the talking to me challenge

2. Secured 3rd place in the PNR keyframe localization challenge

3. Demonstrated competitive performance without modifying the baseline models

Abstract

This technical report describes the EgoTask Translation approach, which explores relations among a set of egocentric video tasks in the Ego4D challenge. To improve the primary task of interest, we propose to leverage existing models developed for other related tasks and design a task translator that learns to "translate" auxiliary task features to the primary task. With no modification to the baseline architectures, our proposed approach achieves competitive performance on two Ego4D challenges, ranking 1st in the talking to me challenge and 3rd in the PNR keyframe localization challenge.
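
The idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the translator's internals, so everything below is an assumption made for illustration: the frozen task models are random stand-ins, and the hypothetical `TaskTranslator` fuses their features with softmax weights followed by a linear head for a binary primary task. It only shows the division of labour described above: task models stay untouched, and only the translator holds trainable parameters.

```python
import math
import random


def frozen_task_model(seed, dim):
    """Hypothetical stand-in for a pretrained task model whose weights stay fixed."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]

    def extract(clip):
        # Map a clip representation (list of floats) to a task-specific feature vector.
        return [math.tanh(sum(w * x for w, x in zip(row, clip))) for row in weights]

    return extract


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class TaskTranslator:
    """Assumed design: weighted fusion of task features plus a linear head.

    Only task_logits and head would be trained; the task models are not modified.
    """

    def __init__(self, num_tasks, dim, seed=0):
        rng = random.Random(seed)
        self.task_logits = [0.0] * num_tasks  # one relevance score per task model
        self.head = [rng.uniform(-0.1, 0.1) for _ in range(dim)]

    def __call__(self, task_features):
        weights = softmax(self.task_logits)
        # Fuse the per-task features into one vector, dimension by dimension.
        fused = [
            sum(w * feats[i] for w, feats in zip(weights, task_features))
            for i in range(len(self.head))
        ]
        logit = sum(h * x for h, x in zip(self.head, fused))
        probability = 1.0 / (1.0 + math.exp(-logit))  # binary primary-task score
        return probability, weights


dim = 4
# One primary-task model and two auxiliary-task models, all frozen.
task_models = [frozen_task_model(seed, dim) for seed in range(3)]
clip = [0.5, -0.2, 0.1, 0.9]  # toy clip representation
features = [model(clip) for model in task_models]

translator = TaskTranslator(num_tasks=3, dim=dim)
probability, task_weights = translator(features)
```

With untrained relevance scores, every task model contributes equally; training the translator would shift `task_weights` toward the auxiliary tasks that actually help the primary one.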

Taxonomy

Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Video Analysis and Summarization