CARLOR @ Ego4D Step Grounding Challenge: Bayesian temporal-order priors for test time refinement

Carlos Plou; Lorenzo Mur-Labadia; Ruben Martinez-Cantin; Ana C. Murillo

arXiv:2406.09575·cs.CV·June 17, 2024

Open Access

TL;DR

This paper presents Bayesian-VSLNet, a model that improves step grounding in egocentric videos by applying a Bayesian temporal-order prior at inference time, achieving state-of-the-art results on the Ego4D Goal-Step dataset.

Contribution

The paper adds a Bayesian temporal-order prior to VSLNet as a test-time refinement, with the aim of improving temporal boundary detection in untrimmed videos.

Findings

01

Achieved 35.18% Recall Top-1 at 0.3 IoU on the Ego4D Goal-Step test set.

02

Achieved 20.48% Recall Top-1 at 0.5 IoU on the Ego4D Goal-Step test set.

03

Reported state-of-the-art results on the Ego4D Goal-Step dataset, outperforming existing methods.
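The metric behind these findings, Recall Top-1 at an IoU threshold, can be sketched as follows. This is a minimal illustration of the standard definition, not the challenge's official evaluation code; the names `temporal_iou` and `recall_at_1` and the (start, end) segment format in seconds are assumptions made for the example.

```python
from typing import Sequence, Tuple

Segment = Tuple[float, float]  # (start_seconds, end_seconds), assumed format


def temporal_iou(pred: Segment, gt: Segment) -> float:
    """Intersection over union of two 1-D time intervals."""
    intersection = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - intersection
    return intersection / union if union > 0 else 0.0


def recall_at_1(preds: Sequence[Segment], gts: Sequence[Segment],
                threshold: float) -> float:
    """Percentage of queries whose top-1 prediction reaches the IoU threshold."""
    if not gts:
        return 0.0
    hits = sum(temporal_iou(p, g) >= threshold for p, g in zip(preds, gts))
    return 100.0 * hits / len(gts)


# Toy data (not from the paper): one top-1 prediction per query.
predictions = [(0.0, 10.0), (0.0, 10.0)]
ground_truth = [(5.0, 15.0), (0.0, 8.0)]
print(recall_at_1(predictions, ground_truth, threshold=0.3))
print(recall_at_1(predictions, ground_truth, threshold=0.5))
```

The stricter 0.5 threshold counts only predictions that overlap the ground truth more tightly, which is why the reported score at 0.5 IoU is lower than at 0.3 IoU.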

Abstract

The goal of the Step Grounding task is to locate temporal boundaries of activities based on natural language descriptions. This technical report introduces a Bayesian-VSLNet to address the challenge of identifying such temporal segments in lengthy, untrimmed egocentric videos. Our model significantly improves upon traditional models by incorporating a novel Bayesian temporal-order prior during inference, enhancing the accuracy of moment predictions. This prior adjusts for cyclic and repetitive actions within videos. Our evaluations demonstrate superior performance over existing methods, achieving state-of-the-art results on the Ego4D Goal-Step dataset with a 35.18 Recall Top-1 at 0.3 IoU and 20.48 Recall Top-1 at 0.5 IoU on the test set.
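The abstract does not spell out the form of the prior, so the following is only a rough sketch of the general idea of test-time Bayesian refinement: per-frame scores from a grounding model are multiplied by a prior over time and renormalized. The Gaussian shape, the assumption that steps occur roughly in query order, the `sigma` value, and the names `order_prior` and `refine` are all hypothetical choices for illustration and may differ from what the report actually uses.

```python
import math
from typing import List


def order_prior(num_frames: int, step_index: int, num_steps: int,
                sigma: float = 0.2) -> List[float]:
    """Hypothetical prior: a Gaussian over normalized time, centered where
    step `step_index` would fall if the steps were evenly spaced in order."""
    center = (step_index + 0.5) / num_steps
    weights = [
        math.exp(-0.5 * (((f + 0.5) / num_frames - center) / sigma) ** 2)
        for f in range(num_frames)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def refine(likelihood: List[float], prior: List[float]) -> List[float]:
    """Bayes rule per frame: posterior is proportional to likelihood * prior."""
    unnormalized = [l * p for l, p in zip(likelihood, prior)]
    total = sum(unnormalized)
    if total == 0:
        return list(prior)  # fall back to the prior if the model gives no signal
    return [u / total for u in unnormalized]


# Toy scores (not from the paper): a repeated action yields two equal peaks.
scores = [0.01] * 100
scores[10] = 1.0
scores[80] = 1.0

early = refine(scores, order_prior(100, step_index=0, num_steps=5))
late = refine(scores, order_prior(100, step_index=4, num_steps=5))
print(early.index(max(early)), late.index(max(late)))
```

In this toy case the model alone cannot choose between the two identical peaks; the prior breaks the tie toward the occurrence that matches the step's expected position in the sequence.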

Taxonomy

Topics: Topic Modeling · Time Series Analysis and Forecasting · Natural Language Processing Techniques