Robust Scene Change Detection Using Visual Foundation Models and Cross-Attention Mechanisms

Chun-Jung Lin; Sourav Garg; Tat-Jun Chin; Feras Dayoub

arXiv:2409.16850 · cs.CV · March 5, 2025

TL;DR

This paper introduces a scene change detection method that combines a visual foundation model (DINOv2) with full-image cross-attention. The method is designed to be robust to lighting, seasonal, and viewpoint variations, and it improves F1-score on the VL-CMU-CD and PSCD benchmarks.

Contribution

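The paper proposes keeping the DINOv2 backbone frozen, to retain the generality of its dense features, and applying full-image cross-attention between the two images, to handle viewpoint differences. Together these choices improve the generalization and robustness of scene change detection.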

Findings

01 Significant F1-score improvements on the VL-CMU-CD and PSCD benchmarks, particularly when image pairs differ geometrically.

02 Robustness against photometric and geometric variations.

03 Better generalization than existing methods.
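Since the findings are reported as F1-scores, the following is a minimal sketch of how F1 can be computed for a pair of binary change masks. It is not taken from the paper or its repository: the function name `change_f1` is hypothetical, and the paper's exact evaluation protocol (for example, per-image versus dataset-level aggregation) is not stated on this page. Returning 0.0 when both masks are empty is an assumed convention.

```python
import numpy as np

def change_f1(pred_mask, true_mask):
    """F1-score between two binary change masks (hypothetical helper).

    Pixels marked True/1 are treated as "changed".
    """
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)

    tp = np.logical_and(pred, true).sum()    # changed, predicted changed
    fp = np.logical_and(pred, ~true).sum()   # unchanged, predicted changed
    fn = np.logical_and(~pred, true).sum()   # changed, predicted unchanged

    denom = 2 * tp + fp + fn
    # Assumed convention: score 0.0 when there are no positives at all.
    return float(2 * tp / denom) if denom > 0 else 0.0

# Toy example: 1 true positive, 1 false positive, 1 false negative.
pred = [[1, 1, 0],
        [0, 0, 0]]
true = [[1, 0, 0],
        [0, 1, 0]]
score = change_f1(pred, true)  # 2*1 / (2*1 + 1 + 1) = 0.5
```

F1 is the harmonic mean of precision and recall, so it penalizes both spurious change pixels and missed ones, which matters when changed regions cover only a small part of the image.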

Abstract

We present a novel method for scene change detection that leverages the robust feature extraction capabilities of a visual foundational model, DINOv2, and integrates full-image cross-attention to address key challenges such as varying lighting, seasonal variations, and viewpoint differences. In order to effectively learn correspondences and mis-correspondences between an image pair for the change detection task, we propose to a) "freeze" the backbone in order to retain the generality of dense foundation features, and b) employ "full-image" cross-attention to better tackle the viewpoint variations between the image pair. We evaluate our approach on two benchmark datasets, VL-CMU-CD and PSCD, along with their viewpoint-varied versions. Our experiments demonstrate significant improvements in F1-score, particularly in scenarios involving geometric changes between image pairs. The…
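The idea of full-image cross-attention over frozen features can be illustrated with a minimal NumPy sketch. This is an assumption-laden illustration, not the authors' implementation: random arrays stand in for the dense patch features a frozen backbone such as DINOv2 would produce, the attention is single-head, and all names (`full_image_cross_attention`, `w_q`, `w_k`, `w_v`) and sizes are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def full_image_cross_attention(feat_a, feat_b, w_q, w_k, w_v):
    """Single-head cross-attention from image A tokens to all image B tokens.

    feat_a: (Na, D) patch features of the first image
    feat_b: (Nb, D) patch features of the second image
    Returns the attended features (Na, Dh) and the attention map (Na, Nb).
    """
    q = feat_a @ w_q                          # queries from image A
    k = feat_b @ w_k                          # keys from image B
    v = feat_b @ w_v                          # values from image B
    scores = q @ k.T / np.sqrt(q.shape[-1])   # scaled dot-product scores
    attn = softmax(scores, axis=-1)           # each A token attends over all of B
    return attn @ v, attn

rng = np.random.default_rng(0)
n_tokens, dim, head_dim = 16, 32, 8           # illustrative sizes only

# Stand-ins for frozen backbone outputs; these would not receive gradients.
feat_t0 = rng.standard_normal((n_tokens, dim))
feat_t1 = rng.standard_normal((n_tokens, dim))

# Only projections like these would be trained in a frozen-backbone setup.
w_q = rng.standard_normal((dim, head_dim)) * 0.1
w_k = rng.standard_normal((dim, head_dim)) * 0.1
w_v = rng.standard_normal((dim, head_dim)) * 0.1

out, attn = full_image_cross_attention(feat_t0, feat_t1, w_q, w_k, w_v)
```

Because every token of one image attends over every token of the other, rather than only a local window, a patch can in principle find its counterpart even when a viewpoint shift moves it elsewhere in the frame.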

Code & Models

Repositories

ChadLin9596/Robust-Scene-Change-Detection
PyTorch · Official

Taxonomy

Topics: Image Retrieval and Classification Techniques · Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques