Event-aware Video Corpus Moment Retrieval
Danyang Hou, Liang Pang, Huawei Shen, Xueqi Cheng

TL;DR
This paper introduces EventFormer, a novel model for Video Corpus Moment Retrieval that explicitly models events within videos, leading to improved accuracy and efficiency over existing frame-aware methods.
Contribution
EventFormer leverages event reasoning and hierarchical encoding to better capture semantic structures, achieving state-of-the-art results on VCMR benchmarks.
Findings
EventFormer outperforms previous methods on TVR, ANetCaps, and DiDeMo datasets.
The model effectively captures event-level information for improved retrieval accuracy.
EventFormer demonstrates efficiency and effectiveness in partially relevant video retrieval tasks.
Abstract
Video Corpus Moment Retrieval (VCMR) is a practical video retrieval task focused on identifying a specific moment within a vast corpus of untrimmed videos using a natural language query. Existing methods for VCMR typically rely on frame-aware video retrieval, calculating similarities between the query and video frames to rank videos based on maximum frame similarity. However, this approach overlooks the semantic structure embedded within the information between frames, namely, the event, a crucial element for human comprehension of videos. Motivated by this, we propose EventFormer, a model that explicitly utilizes events within videos as fundamental units for video retrieval. The model extracts event representations through event reasoning and hierarchical event encoding. The event reasoning module groups consecutive and visually similar frame representations into events, while the…
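The contrast the abstract draws between frame-aware and event-aware retrieval can be illustrated with a small sketch. This is not EventFormer itself. It assumes toy list-of-float embeddings, cosine similarity as the scoring function, a greedy similarity threshold as a stand-in for the event reasoning module, and mean pooling as a stand-in for event encoding. All function names and the threshold value are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def frame_aware_score(query, frames):
    """Frame-aware baseline: a video's score is its maximum frame similarity."""
    return max(cosine(query, f) for f in frames)


def group_into_events(frames, threshold=0.9):
    """Assumed stand-in for event reasoning: greedily merge consecutive frames
    whose similarity to the previous frame is at least `threshold`."""
    events = [[frames[0]]]
    for prev, cur in zip(frames, frames[1:]):
        if cosine(prev, cur) >= threshold:
            events[-1].append(cur)   # visually similar: same event
        else:
            events.append([cur])     # dissimilar: start a new event
    return events


def mean_vector(vectors):
    """Mean-pool a list of vectors (assumed stand-in for event encoding)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def event_aware_score(query, frames, threshold=0.9):
    """Score a video by its best-matching event instead of its best frame."""
    events = group_into_events(frames, threshold)
    return max(cosine(query, mean_vector(e)) for e in events)


def rank_videos(query, corpus, score_fn):
    """Return video ids sorted from highest to lowest score."""
    return sorted(corpus, key=lambda vid: score_fn(query, corpus[vid]), reverse=True)


# Toy corpus: video id -> list of 2-d frame embeddings.
corpus = {
    "a": [[1.0, 0.0], [0.9, 0.1]],
    "b": [[0.0, 1.0], [0.1, 0.9]],
}
query = [1.0, 0.0]
ranking_by_frame = rank_videos(query, corpus, frame_aware_score)
ranking_by_event = rank_videos(query, corpus, event_aware_score)
```

In this sketch the two scoring functions differ only in the unit being compared with the query: individual frames in one case, pooled groups of consecutive similar frames in the other.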
Taxonomy
Topics: Video Analysis and Summarization · Music and Audio Processing · Multimodal Machine Learning Applications
Methods: Linear Layer · Dense Connections · Label Smoothing · Adam · Attention Is All You Need · Contrastive Learning · Softmax · Multi-Head Attention · Layer Normalization · Dropout
