RemoteVAR: Autoregressive Visual Modeling for Remote Sensing Change Detection
Yilmaz Korkmaz, Vishal M. Patel

TL;DR
RemoteVAR is a visual autoregressive (VAR) model for remote sensing change detection. It conditions autoregressive prediction on multi-resolution fused bi-temporal features and uses a training strategy designed for change map prediction; the authors report that it outperforms diffusion-based and transformer-based baselines.
Contribution
A VAR-based framework for pixel-level change detection that conditions autoregressive prediction on multi-resolution fused bi-temporal features via cross-attention and uses an autoregressive training strategy tailored to change map prediction.
Findings
Outperforms diffusion-based and transformer-based methods on standard change detection benchmarks.
Provides a competitive autoregressive alternative for change detection.
Demonstrates significant improvements in dense prediction accuracy.
Abstract
Remote sensing change detection aims to localize and characterize scene changes between two time points and is central to applications such as environmental monitoring and disaster assessment. Meanwhile, visual autoregressive models (VARs) have recently shown impressive image generation capability, but their adoption for pixel-level discriminative tasks remains limited due to weak controllability, suboptimal dense prediction performance and exposure bias. We introduce RemoteVAR, a new VAR-based change detection framework that addresses these limitations by conditioning autoregressive prediction on multi-resolution fused bi-temporal features via cross-attention, and by employing an autoregressive training strategy designed specifically for change map prediction. Extensive experiments on standard change detection benchmarks show that RemoteVAR delivers consistent and significant…
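The conditioning idea in the abstract, where autoregressive tokens attend to fused bi-temporal features from several resolutions, can be illustrated with a small sketch. This is not the authors' implementation. It assumes fusion by element-wise absolute difference, a single attention head, and no learned projections. All function names (`fuse_bitemporal`, `cross_attention`, `condition_tokens`) are hypothetical, and plain Python lists stand in for tensors to keep the example dependency-free.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_bitemporal(feat_t1, feat_t2):
    """Assumed fusion rule: element-wise absolute difference per token."""
    return [
        [abs(a - b) for a, b in zip(tok1, tok2)]
        for tok1, tok2 in zip(feat_t1, feat_t2)
    ]


def cross_attention(queries, context):
    """Single-head scaled dot-product attention.

    Keys and values are both `context`; a real model would apply
    learned query/key/value projections first.
    """
    dim = len(context[0])
    scale = 1.0 / math.sqrt(dim)
    outputs = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in context]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, context)) for d in range(dim)]
        )
    return outputs


def condition_tokens(queries, pyramid_t1, pyramid_t2):
    """Attend to fused features from every resolution.

    Each pyramid is a list of scales; each scale is a list of token
    vectors. Fused tokens from all scales are concatenated along the
    token axis to form one attention context.
    """
    context = []
    for feat_t1, feat_t2 in zip(pyramid_t1, pyramid_t2):
        context.extend(fuse_bitemporal(feat_t1, feat_t2))
    return cross_attention(queries, context)


# Toy two-scale pyramids with 2-dimensional features (illustrative values).
pyramid_t1 = [[[0.0, 0.0]], [[0.0, 0.0], [1.0, 1.0]]]
pyramid_t2 = [[[0.0, 0.0]], [[0.0, 0.0], [1.0, 3.0]]]
queries = [[0.0, 0.0], [5.0, 5.0]]
conditioned = condition_tokens(queries, pyramid_t1, pyramid_t2)
```

In this toy setup only one location differs between the two dates, so the query that aligns with that fused token places almost all of its attention weight there, while the zero query averages the context uniformly.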
Taxonomy
Topics: Remote-Sensing Image Classification · Remote Sensing in Agriculture · Domain Adaptation and Few-Shot Learning
