Multi-Query Video Retrieval

Zeyu Wang; Yu Wu; Karthik Narasimhan; Olga Russakovsky

arXiv:2201.03639 · cs.CV · July 22, 2022

TL;DR

This paper explores multi-query video retrieval, showing that searching with multiple descriptions of the target video makes model evaluation more reliable, reduces the impact of dataset noise, and improves retrieval performance and generalization.

Contribution

It focuses on the less-studied multi-query retrieval setting, showing that it mitigates dataset noise and benefits both model training and model evaluation.

Findings

01. Multi-query retrieval correlates better with human judgment.

02. Multi-query training improves model performance.

03. Multi-query retrieval mitigates dataset annotation noise.

Abstract

Retrieving target videos based on text descriptions is a task of great practical value and has received increasing attention over the past few years. Despite recent progress, imperfect annotations in existing video retrieval datasets have posed significant challenges for model evaluation and development. In this paper, we tackle this issue by focusing on the less-studied setting of multi-query video retrieval, where multiple descriptions are provided to the model for searching over the video archive. We first show that the multi-query retrieval task effectively mitigates the dataset noise introduced by imperfect annotations and correlates better with human judgement in evaluating the retrieval abilities of current models. We then investigate several methods which leverage multiple queries at training time, and demonstrate that multi-query-inspired training can lead to superior performance…
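To make the multi-query setting concrete, here is a minimal sketch of ranking videos when several descriptions are given for one search. It assumes text queries and videos have already been embedded into a shared vector space, and it combines the queries by averaging their cosine similarities to each video. That averaging rule is only one simple aggregation choice used here for illustration, not necessarily the method used in the paper; the names `cosine` and `rank_videos` and the toy embeddings are hypothetical.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_videos(
    query_embs: Sequence[Sequence[float]],
    video_embs: Sequence[Sequence[float]],
) -> List[int]:
    """Return video indices sorted from best to worst match.

    Assumption: each video's score is the mean of its cosine
    similarities to all queries (an illustrative aggregation rule).
    """
    if not query_embs:
        raise ValueError("at least one query embedding is required")
    scores = []
    for video in video_embs:
        sims = [cosine(query, video) for query in query_embs]
        scores.append(sum(sims) / len(sims))
    return sorted(range(len(video_embs)), key=lambda i: scores[i], reverse=True)


# Toy 2-D embeddings (hypothetical): video 2 partly matches both descriptions.
videos = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
single_query = [[1.0, 0.0]]
two_queries = [[1.0, 0.0], [0.0, 1.0]]

single_ranking = rank_videos(single_query, videos)
multi_ranking = rank_videos(two_queries, videos)
```

With one query, the video aligned with that description ranks first. With both queries, the video that is consistent with both descriptions moves to the top, which illustrates how additional descriptions can offset a single incomplete or noisy one.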

Code & Models

Repositories

princetonvisualai/mqvr
PyTorch · Official

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning