Imagine Before Concentration: Diffusion-Guided Registers Enhance Partially Relevant Video Retrieval

Jun Li; Xuhang Lou; Jinpeng Wang; Yuting Wang; Yaowei Wang; Shu-Tao Xia; Bin Chen

arXiv:2604.03653·cs.CV·April 7, 2026

TL;DR

DreamPRVR introduces a diffusion-guided, coarse-to-fine approach for partially relevant video retrieval, improving global context understanding and cross-modal matching accuracy.

Contribution

The paper proposes a diffusion-based method that generates and refines global semantic registers to improve video-text retrieval.

Findings

01  Outperforms state-of-the-art PRVR methods on benchmark datasets.

02  Effectively models global context with diffusion-guided semantic registers.

03  Enhances cross-modal matching through register-augmented attention.

Abstract

Partially Relevant Video Retrieval (PRVR) aims to retrieve untrimmed videos based on text queries that describe only partial events. Existing methods suffer from incomplete global contextual perception, struggling with query ambiguity and local noise induced by spurious responses. To address these issues, we propose DreamPRVR, which adopts a coarse-to-fine representation learning paradigm. The model first generates global contextual semantic registers as coarse-grained highlights spanning the entire video and then concentrates on fine-grained similarity optimization for precise cross-modal matching. Concretely, these registers are generated by initializing from the video-centric distribution produced by a probabilistic variational sampler and then iteratively refined via a text-supervised truncated diffusion model. During this process, textual semantic structure learning constructs a…
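
The coarse-to-fine flow described in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation: every function name, shape, and constant below is hypothetical, the "video-centric distribution" is assumed to be a diagonal Gaussian fitted to the clip features, the denoiser is a placeholder that ignores text supervision, and registers are assumed to enter attention as extra keys and values.

```python
import math
import random


def sample_registers(clip_feats, num_registers, rng):
    """Draw initial registers from a diagonal Gaussian fitted to the clip features.

    Assumption: this stands in for the probabilistic variational sampler.
    """
    n, dim = len(clip_feats), len(clip_feats[0])
    mu = [sum(c[d] for c in clip_feats) / n for d in range(dim)]
    sigma = [math.sqrt(sum((c[d] - mu[d]) ** 2 for c in clip_feats) / n)
             for d in range(dim)]
    # Reparameterised sample: mean + std * standard normal noise.
    return [[mu[d] + sigma[d] * rng.gauss(0.0, 1.0) for d in range(dim)]
            for _ in range(num_registers)]


def refine_registers(registers, denoise_fn, num_steps):
    """Run a short (truncated) refinement chain starting from the sampled registers."""
    for t in reversed(range(num_steps)):
        registers = [denoise_fn(r, t) for r in registers]
    return registers


def register_augmented_attention(clip_feats, registers):
    """Let each clip attend over the registers plus all clips (softmax attention)."""
    keys = registers + clip_feats
    dim = len(clip_feats[0])
    scale = math.sqrt(dim)
    fused = []
    for q in clip_feats:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        fused.append([sum(w / total * k[d] for w, k in zip(weights, keys))
                      for d in range(dim)])
    return fused


rng = random.Random(0)
clip_feats = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(8)]
target = [sum(c[d] for c in clip_feats) / len(clip_feats) for d in range(16)]


def toy_denoiser(register, t):
    # Placeholder for a learned denoiser: moves each register halfway to a fixed target.
    return [0.5 * r + 0.5 * g for r, g in zip(register, target)]


registers = refine_registers(sample_registers(clip_feats, 4, rng), toy_denoiser, 3)
fused = register_augmented_attention(clip_feats, registers)
```

Here `fused` holds one context-enriched vector per clip. In a real system the placeholder denoiser would be a trained network, and the fused clip features would feed the fine-grained text-video similarity step that the abstract mentions.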

Code & Models

Repositories

lijun2005/CVPR26-DreamPRVR (GitHub)