EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval

Thomas Hummel; Shyamgopal Karthik; Mariana-Iuliana Georgescu; Zeynep Akata

arXiv:2407.16658·cs.CV·July 24, 2024

TL;DR

EgoCVR introduces a new egocentric video benchmark for fine-grained composed video retrieval, highlighting the need for better temporal understanding and proposing a re-ranking method that improves retrieval performance.

Contribution

The paper presents EgoCVR, an evaluation benchmark of 2,295 queries for fine-grained Composed Video Retrieval built from large-scale egocentric video datasets. It also adapts a simple training-free method and proposes a generic re-ranking framework to improve retrieval accuracy.

Findings

01 Existing methods lack high-quality temporal understanding.

02 The proposed re-ranking framework significantly improves retrieval results.

03 EgoCVR provides a challenging benchmark for future research.

Abstract

In Composed Video Retrieval, a video and a textual description which modifies the video content are provided as inputs to the model. The aim is to retrieve the relevant video with the modified content from a database of videos. In this challenging task, the first step is to acquire large-scale training datasets and collect high-quality benchmarks for evaluation. In this work, we introduce EgoCVR, a new evaluation benchmark for fine-grained Composed Video Retrieval using large-scale egocentric video datasets. EgoCVR consists of 2,295 queries that specifically focus on high-quality temporal video understanding. We find that existing Composed Video Retrieval frameworks do not achieve the necessary high-quality temporal video understanding for this task. To address this shortcoming, we adapt a simple training-free method, propose a generic re-ranking framework for Composed Video Retrieval,…
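To make the task setup concrete, here is a minimal sketch of a generic two-stage retrieve-then-re-rank pipeline over precomputed embeddings. It is an illustration only and not the paper's method: the abstract does not specify the stages, so the function name `rerank_retrieve`, the `shortlist_size` parameter, the use of cosine similarity, and the choice to shortlist by the query video and re-rank by the text are all assumptions. The embeddings are toy vectors standing in for the output of some video and text encoder.

```python
import numpy as np


def l2_normalize(x):
    """Scale vectors to unit length so dot products are cosine similarities."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def rerank_retrieve(query_video_emb, target_text_emb, gallery_embs, shortlist_size=3):
    """Hypothetical two-stage retrieval (assumed design, not from the paper).

    Stage 1 keeps the gallery videos most similar to the query video.
    Stage 2 re-orders that shortlist by similarity to the text embedding
    describing the desired modification.
    Returns gallery indices, best match first.
    """
    gallery = l2_normalize(gallery_embs)

    # Stage 1: shortlist by cosine similarity to the query video.
    stage1_scores = gallery @ l2_normalize(query_video_emb)
    shortlist = np.argsort(-stage1_scores)[:shortlist_size]

    # Stage 2: re-rank only the shortlisted videos by similarity to the text.
    stage2_scores = gallery[shortlist] @ l2_normalize(target_text_emb)
    return shortlist[np.argsort(-stage2_scores)]


# Toy 2-D embeddings (made up for illustration).
gallery_embs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.7, 0.7]])
query_video_emb = np.array([1.0, 0.0])
target_text_emb = np.array([0.0, 1.0])

ranking = rerank_retrieve(query_video_emb, target_text_emb, gallery_embs)
print(ranking.tolist())  # [3, 1, 0]
```

In this toy example, gallery item 2 is the closest match to the text alone, but it is dropped in stage 1 because it looks nothing like the query video. This shows the trade-off in any shortlist-based design: the re-ranker can only reorder what the first stage lets through.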

Code & Models

Repositories

explainableml/egocvr
PyTorch (official)
