VSCD: Video-based Scene Change Detection in Unaligned Scenes
Jiae Yoon, Ue-Hwan Kim

TL;DR
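This paper introduces Video-based Scene Change Detection (VSCD): pixel-wise change detection between a reference and a query video of the same indoor space, recorded at different times under unconstrained camera motion. It is supported by a benchmark of over 1.1 million annotated frames and a real-world robot deployment.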
Contribution
The paper presents a query-centric multi-reference model for scene change detection in unaligned, moving-camera videos, along with a large-scale benchmark dataset and validation on a real-world robot.
Findings
Achieves state-of-the-art performance on the new benchmark.
Effectively detects scene changes despite camera motion and object appearance/disappearance.
Validates real-world applicability through deployment on a mobile robot.
Abstract
Detecting what has changed in an environment is essential for long-term autonomy, yet most change detection settings assume fixed viewpoints, mild misalignment, or only a few changed objects. We introduce Video-based Scene Change Detection (VSCD), which predicts a pixel-wise change mask for each query frame, given a reference and a query RGB video of the same indoor space recorded at different times under unconstrained camera motion. The two videos are not temporally synchronized, and many object instances may appear or disappear. To study this setting, we build a large-scale benchmark with over 1.1 million frames annotated with pixel-accurate change masks, together with a real-world test set for evaluating transfer beyond simulation. We propose a query-centric multi-reference model that learns temporal matching implicitly from change-mask supervision, aligns candidate reference…
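To make the task's input/output contract concrete, here is a minimal sketch of a toy baseline. It is not the paper's model: the names `frame_distance` and `naive_change_masks`, the grayscale frames, and the threshold value are all assumptions made for illustration. It only shows that the reference and query videos can have different lengths and that the output is one change mask per query frame. Unlike the setting described above, it ignores spatial misalignment caused by camera motion.

```python
from typing import List

# Assumption: small grayscale grids stand in for RGB frames to keep the sketch short.
Frame = List[List[int]]   # H x W pixel intensities
Mask = List[List[bool]]   # H x W, True where a pixel is flagged as changed


def frame_distance(a: Frame, b: Frame) -> float:
    """Mean absolute pixel difference between two frames of equal size."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def naive_change_masks(reference: List[Frame], query: List[Frame],
                       threshold: int = 30) -> List[Mask]:
    """Hypothetical toy baseline: one mask per query frame.

    For each query frame, pick the closest reference frame (a crude stand-in
    for temporal matching), then threshold the per-pixel difference.
    """
    masks = []
    for q in query:
        best = min(reference, key=lambda r: frame_distance(q, r))
        masks.append([[abs(pq - pr) > threshold for pq, pr in zip(rq, rr)]
                      for rq, rr in zip(q, best)])
    return masks


# The two videos are not synchronized: 3 reference frames, 2 query frames.
reference = [[[0, 0], [0, 0]], [[100, 100], [100, 100]], [[200, 200], [200, 200]]]
query = [[[100, 100], [100, 255]], [[0, 0], [0, 0]]]
masks = naive_change_masks(reference, query)
```

A pixel-difference baseline like this breaks down as soon as the viewpoint differs between the two videos, which is the case the VSCD setting targets.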
