VFM-Guided Semi-Supervised Detection Transformer under Source-Free Constraints for Remote Sensing Object Detection
Jianhong Han, Yupei Wang, Liang Chen

TL;DR
This paper introduces VG-DETR, a semi-supervised, source-free object detection method for remote sensing imagery that leverages a vision foundation model (VFM) to improve pseudo-label quality and feature robustness without access to source data.
Contribution
The paper proposes a VFM-guided semi-supervised framework for source-free remote sensing object detection that integrates semantic priors and dual-level alignment.
Findings
VG-DETR outperforms existing methods in remote sensing detection tasks.
VFM-guided pseudo-label mining improves both the accuracy and the quantity of pseudo-labels.
Dual-level alignment enhances feature robustness against domain gaps.
Abstract
Unsupervised domain adaptation methods have been widely explored to bridge domain gaps. However, in real-world remote-sensing scenarios, privacy and transmission constraints often preclude access to source domain data, which limits their practical applicability. Recently, Source-Free Object Detection (SFOD) has emerged as a promising alternative, aiming at cross-domain adaptation without relying on source data, primarily through a self-training paradigm. Despite its potential, SFOD frequently suffers from training collapse caused by noisy pseudo-labels, especially in remote sensing imagery with dense objects and complex backgrounds. Considering that limited target domain annotations are often feasible in practice, we propose a Vision foundation-Guided DEtection TRansformer (VG-DETR), built upon a semi-supervised framework for SFOD in remote sensing images. VG-DETR integrates a Vision…
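The self-training paradigm mentioned in the abstract is often realized as a teacher-student loop: a teacher model predicts on unlabeled target images, low-confidence predictions are discarded, and the remaining ones serve as pseudo-labels for the student. The sketch below illustrates only that generic loop and is not VG-DETR's VFM-guided mining. The `Detection` class, the function names, the 0.7 threshold, and the momentum values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical container for one teacher prediction."""
    box: tuple    # (x_min, y_min, x_max, y_max) in pixels
    label: str
    score: float  # teacher confidence in [0, 1]


def filter_pseudo_labels(detections, threshold=0.7):
    """Keep only teacher predictions whose confidence reaches the threshold."""
    return [d for d in detections if d.score >= threshold]


def ema_update(teacher_weights, student_weights, momentum=0.99):
    """Move each teacher weight toward the student by exponential moving average."""
    return {
        name: momentum * teacher_weights[name]
        + (1.0 - momentum) * student_weights[name]
        for name in teacher_weights
    }


# Toy teacher output for one unlabeled target-domain image (made-up values).
teacher_predictions = [
    Detection((10, 10, 50, 40), "plane", 0.92),
    Detection((60, 20, 90, 55), "ship", 0.41),
    Detection((5, 70, 30, 95), "vehicle", 0.78),
]

pseudo_labels = filter_pseudo_labels(teacher_predictions, threshold=0.7)
print([d.label for d in pseudo_labels])  # ['plane', 'vehicle']
```

A fixed threshold makes the trade-off described in the abstract visible: set it too low and noisy pseudo-labels reach the student, set it too high and few objects in dense scenes survive the filter.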
