CroBIM-V: Memory-Quality Controlled Remote Sensing Referring Video Object Segmentation
H. Jiang, Y. Sun, Z. Dong, T. Liu, Y. Gu

TL;DR
This paper introduces RS-RVOS Bench, a large-scale dataset for remote sensing video referring object segmentation (RS-RVOS), and proposes MQC-SAM, a memory-quality-aware segmentation framework that improves accuracy by filtering unreliable information out of its memory.
Contribution
The paper provides the first large-scale RS-RVOS benchmark and a novel memory management framework that enhances segmentation accuracy and robustness in dynamic remote sensing videos.
Findings
MQC-SAM achieves state-of-the-art performance on RS-RVOS Bench.
The dataset adopts causality-aware annotations for more realistic scenarios.
Memory quality control effectively prevents error propagation.
Abstract
Remote sensing video referring object segmentation (RS-RVOS) is challenged by weak target saliency and severe visual information truncation in dynamic scenes, making it extremely difficult to maintain discriminative target representations during segmentation. Moreover, progress in this field is hindered by the absence of large-scale dedicated benchmarks, while existing models are often affected by biased initial memory construction that impairs accurate instance localization in complex scenarios, as well as indiscriminate memory accumulation that encodes noise from occlusions or misclassifications, leading to persistent error propagation. This paper advances RS-RVOS research through dual contributions in data and methodology. First, we construct RS-RVOS Bench, the first large-scale benchmark comprising 111 video sequences, about 25,000 frames, and 213,000 temporal referring annotations.…
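The abstract excerpt above does not say how MQC-SAM scores or filters its memory, so the following is only an illustrative sketch of the general idea of gating memory writes on a quality score, not the authors' method. The names `MemoryEntry` and `QualityGatedMemory`, the `quality` score, the threshold, and the oldest-first eviction rule are all hypothetical assumptions.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List


@dataclass
class MemoryEntry:
    frame_index: int
    feature: List[float]  # stand-in for a per-frame target embedding
    quality: float        # hypothetical reliability score in [0, 1]


@dataclass
class QualityGatedMemory:
    capacity: int = 4          # assumed maximum number of stored frames
    min_quality: float = 0.6   # assumed admission threshold
    entries: Deque[MemoryEntry] = field(default_factory=deque)

    def update(self, entry: MemoryEntry) -> bool:
        """Store the entry only if its quality clears the threshold.

        Returns True when the entry was written to memory.
        """
        if entry.quality < self.min_quality:
            # Low-quality frames (e.g. occluded or misclassified targets)
            # are skipped so they cannot contaminate later predictions.
            return False
        if len(self.entries) >= self.capacity:
            # Simple oldest-first eviction; a real system might choose differently.
            self.entries.popleft()
        self.entries.append(entry)
        return True


if __name__ == "__main__":
    memory = QualityGatedMemory(capacity=2, min_quality=0.6)
    for idx, score in enumerate([0.9, 0.2, 0.7, 0.8]):
        accepted = memory.update(MemoryEntry(idx, [0.0], score))
        print(f"frame {idx}: quality={score:.1f} accepted={accepted}")
```

In this sketch, an indiscriminate memory would correspond to `min_quality=0.0`, where every frame is written regardless of reliability; raising the threshold is what keeps noisy frames from being reused as references.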
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Video Surveillance and Tracking Methods
