VSCD: Video-based Scene Change Detection in Unaligned Scenes

Jiae Yoon; Ue-Hwan Kim

arXiv:2605.20821·cs.CV·May 21, 2026

TL;DR

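This paper introduces Video-based Scene Change Detection (VSCD), a task that predicts a pixel-wise change mask for each query frame of an indoor video recorded under unconstrained camera motion. It supports the task with a large annotated benchmark, a query-centric multi-reference model, and a real-world robot deployment.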

Contribution

The paper contributes an approach to scene change detection for reference and query videos that are unaligned and captured by a moving camera, a benchmark of over 1.1 million annotated frames, and validation on a real-world mobile robot.

Findings

01

Achieves state-of-the-art performance on the new benchmark.

02

Detects scene changes despite camera motion and the appearance or disappearance of many object instances.

03

Validates real-world applicability through deployment on a mobile robot.

Abstract

Detecting what has changed in an environment is essential for long-term autonomy, yet most change detection settings assume fixed viewpoints, mild misalignment, or only a few changed objects. We introduce Video-based Scene Change Detection (VSCD), which predicts a pixel-wise change mask for each query frame, given a reference and a query RGB video of the same indoor space recorded at different times under unconstrained camera motion. The two videos are not temporally synchronized, and many object instances may appear or disappear. To study this setting, we build a large-scale benchmark with over 1.1 million frames annotated with pixel-accurate change masks, together with a real-world test set for evaluating transfer beyond simulation. We propose a query-centric multi-reference model that learns temporal matching implicitly from change-mask supervision, aligns candidate reference…
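The input/output contract of the task described above can be sketched in a few lines. This is a toy frame-differencing baseline, not the authors' query-centric multi-reference model: the function name `naive_change_masks`, the `threshold` parameter, and the `(frames, height, width, 3)` array layout are all assumptions made for illustration. It picks the most similar reference frame for each query frame and thresholds the per-pixel difference, which only works when the viewpoints nearly coincide.

```python
import numpy as np


def naive_change_masks(reference, query, threshold=0.2):
    """Toy baseline (hypothetical, NOT the paper's model).

    reference: uint8 array of shape (T_ref, H, W, 3)
    query:     uint8 array of shape (T_qry, H, W, 3)
    returns:   bool array of shape (T_qry, H, W), one change mask per query frame
    """
    if reference.ndim != 4 or query.ndim != 4:
        raise ValueError("expected (frames, height, width, 3) arrays")
    if reference.shape[1:] != query.shape[1:]:
        raise ValueError("reference and query frames must share a resolution")

    ref = reference.astype(np.float32) / 255.0
    qry = query.astype(np.float32) / 255.0
    masks = np.zeros(query.shape[:3], dtype=bool)

    for i, frame in enumerate(qry):
        # The videos are not synchronized, so search all reference frames
        # for the one closest to this query frame (mean absolute difference).
        dists = np.abs(ref - frame).mean(axis=(1, 2, 3))
        best = int(np.argmin(dists))
        # Per-pixel difference against the chosen reference frame,
        # averaged over the colour channels.
        diff = np.abs(ref[best] - frame).mean(axis=-1)
        masks[i] = diff > threshold

    return masks
```

Under free camera motion, raw pixel differencing of this kind marks almost every pixel as changed, which illustrates why the setting calls for matching and alignment between the two videos.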
