TL;DR
OpenDPR is a training-free framework for open-vocabulary change detection in remote sensing imagery. It identifies change categories through diffusion-guided prototype retrieval and pairs this with a weakly supervised change localization module.
Contribution
The paper proposes a vision-centric, diffusion-guided prototype retrieval method for category identification and a weakly supervised change detection module for localization, both aimed at improving open-vocabulary change detection accuracy.
Findings
Achieves state-of-the-art performance on four benchmark datasets.
Effectively combines foundation models with diffusion models for prototype generation.
Enhances change localization with minimal supervision.
Abstract
Open-vocabulary change detection (OVCD) seeks to recognize arbitrary changes of interest by enabling generalization beyond a fixed set of predefined classes. We reformulate OVCD as a two-stage pipeline: first generate class-agnostic change proposals using visual foundation models (VFMs) such as SAM and DINOv2, and then perform category identification with vision-language models (VLMs) such as CLIP. We reveal that category identification errors are the primary bottleneck of OVCD, mainly due to the limited ability of VLMs based on image-text matching to represent fine-grained land-cover categories. To address this, we propose OpenDPR, a training-free vision-centric diffusion-guided prototype retrieval framework. OpenDPR leverages diffusion models to construct diverse prototypes for target categories offline, and to perform similarity retrieval with change proposals in the visual space…
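The retrieval step described in the abstract, matching change proposals against per-category prototypes in the visual space, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the max-over-prototypes scoring, and the rejection threshold are all assumptions made for illustration, and small hand-written vectors stand in for embeddings that would come from a visual foundation model.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products equal cosine similarity."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)


def retrieve_categories(proposal_feats, prototypes, threshold=0.5):
    """Assign each change proposal the class of its most similar prototype.

    proposal_feats: (N, D) array-like, one embedding per change proposal.
    prototypes: dict mapping class name -> (K, D) array-like of prototypes.
    threshold: hypothetical cutoff; proposals scoring below it get None.
    """
    names = list(prototypes)
    feats = l2_normalize(np.asarray(proposal_feats, dtype=np.float64))
    per_class = []
    for name in names:
        protos = l2_normalize(np.asarray(prototypes[name], dtype=np.float64))
        # Class score = best cosine similarity over that class's prototypes
        # (an assumed aggregation rule, chosen for simplicity).
        per_class.append((feats @ protos.T).max(axis=1))
    scores = np.stack(per_class, axis=1)  # shape (N, C)
    best = scores.argmax(axis=1)
    labels = [
        names[j] if scores[i, j] >= threshold else None
        for i, j in enumerate(best)
    ]
    return labels, scores


# Toy 3-D embeddings (illustrative only, not real features).
toy_prototypes = {
    "building": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "water": [[0.0, 1.0, 0.0]],
}
toy_proposals = [[0.8, 0.2, 0.0], [0.1, 0.9, 0.0], [0.0, 0.0, 1.0]]

labels, scores = retrieve_categories(toy_proposals, toy_prototypes)
print(labels)  # ['building', 'water', None]
```

The third proposal is orthogonal to every prototype, so its best score falls below the threshold and it is left unlabeled. In a real pipeline the prototype bank would be built once offline and reused across images, which is what keeps the inference step training-free.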
