SceneDiff: A Benchmark and Method for Multiview Object Change Detection
Yuqun Wu, Chih-hao Lin, Henry Che, Aditi Tiwari, Chuhang Zou, Shenlong Wang, Derek Hoiem

TL;DR
SceneDiff introduces a new multiview change detection benchmark and a training-free method that leverages pretrained models to identify object changes across different camera views.
Contribution
The paper presents the first multiview change detection dataset for scenes captured along different camera trajectories, together with a training-free algorithm that builds on pretrained models to generalize across domains.
Findings
SceneDiff outperforms existing methods, with relative average precision (AP) improvements of 53.0% and 30.6%.
The dataset includes 350 diverse video pairs with dense object annotations.
The method leverages pretrained models for cross-domain generalization.
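The AP figures above are relative improvements, which are easy to confuse with absolute gains in AP points. The short sketch below shows how a relative improvement is usually computed. The function name and the example AP values are hypothetical and chosen only for illustration; they are not numbers from the paper.

```python
def relative_improvement(baseline_ap: float, new_ap: float) -> float:
    """Return the relative improvement of new_ap over baseline_ap, in percent.

    Both inputs are AP scores on the same scale (here, percentage points).
    """
    if baseline_ap <= 0:
        raise ValueError("baseline_ap must be positive")
    return (new_ap - baseline_ap) / baseline_ap * 100.0


# Hypothetical scores: a baseline at 20.0 AP and a new method at 30.0 AP.
# The absolute gain is 10.0 AP points; the relative gain is larger.
gain = relative_improvement(20.0, 30.0)
print(f"relative improvement: {gain:.1f}%")
```

A relative improvement therefore depends on the baseline score: the same absolute gain yields a larger relative figure when the baseline is low.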
Abstract
We investigate the problem of identifying objects that have been added, removed, or moved between a pair of captures (images or videos) of the same scene at different times. Accurately identifying verifiable changes is extremely challenging -- some objects may appear to be missing because they are occluded or out of frame, while others may appear different due to large viewpoint changes. To study this problem, we introduce the SceneDiff Benchmark, the first multiview change detection dataset for scenes captured along different camera trajectories, comprising 350 diverse video pairs with dense object instance-level annotations. We also introduce the SceneDiff algorithm, a training-free approach that solves for image poses, segments images into objects, and compares them using semantic and geometric features. By building on pretrained models, SceneDiff generalizes across domains without…
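The comparison step described in the abstract (matching objects by semantic and geometric features) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes poses have already been solved so that object centroids live in one shared world frame, and that each segmented object already carries a semantic embedding. The `SceneObject` type, the greedy matching, and both thresholds are hypothetical, and the visibility reasoning needed for occluded or out-of-frame objects is omitted.

```python
import math
from dataclasses import dataclass


@dataclass
class SceneObject:
    """One segmented object, expressed in a shared world frame (assumed)."""
    name: str
    centroid: tuple  # 3D position (x, y, z); units are hypothetical
    feature: tuple   # semantic embedding from a pretrained model (hypothetical)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify_changes(before, after, sim_thresh=0.8, move_thresh=0.25):
    """Greedily match objects across two captures and label the changes.

    sim_thresh and move_thresh are illustrative values, not tuned settings.
    """
    unmatched_after = list(after)
    removed, moved = [], []
    for obj in before:
        # Semantic cue: pick the most similar remaining object in the later capture.
        best = max(unmatched_after,
                   key=lambda o: cosine(obj.feature, o.feature),
                   default=None)
        if best is None or cosine(obj.feature, best.feature) < sim_thresh:
            removed.append(obj.name)
            continue
        unmatched_after.remove(best)
        # Geometric cue: a matched object that shifted far enough counts as moved.
        if math.dist(obj.centroid, best.centroid) > move_thresh:
            moved.append(obj.name)
    added = [o.name for o in unmatched_after]
    return {"added": added, "removed": removed, "moved": moved}


before = [
    SceneObject("mug", (0.0, 0.0, 0.0), (1.0, 0.0)),
    SceneObject("book", (1.0, 0.0, 0.0), (0.0, 1.0)),
]
after = [
    SceneObject("mug", (0.5, 0.0, 0.0), (1.0, 0.0)),
    SceneObject("lamp", (2.0, 0.0, 0.0), (-1.0, 0.0)),
]
changes = classify_changes(before, after)
print(changes)
```

In this toy scene the mug is matched and reported as moved, the book has no sufficiently similar counterpart and is reported as removed, and the lamp is left unmatched and reported as added. A real system would also need to check whether an unmatched object was simply occluded or outside the field of view, which is the difficulty the abstract highlights.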
