Few-Shot Object Detection via Spatial-Channel State Space Model

Zhimeng Xin, Tianxu Wu, Yixiong Zou, Shiming Chen, Dingjie Fu, and Xinge You

arXiv:2507.15308 · cs.CV · July 22, 2025

TL;DR

This paper introduces a Spatial-Channel State Space Modeling (SCSM) module for few-shot object detection that uses inter-channel correlation to strengthen feature representation and improve detection accuracy.

Contribution

The paper proposes the SCSM module, combining spatial and channel state modeling with Mamba-based correlation learning, to better highlight effective features in few-shot detection.

Findings

1. Achieves state-of-the-art results on the VOC and COCO datasets.

2. Improves feature focus and detection accuracy in few-shot scenarios.

3. Demonstrates the effectiveness of inter-channel correlation modeling.

Abstract

Due to the limited training samples in few-shot object detection (FSOD), we observe that current methods may struggle to accurately extract effective features from each channel. Specifically, this issue manifests in two aspects: i) channels with high weights may not necessarily be effective, and ii) channels with low weights may still hold significant value. To handle this problem, we consider utilizing the inter-channel correlation to facilitate the novel model's adaptation process to novel conditions, ensuring the model can correctly highlight effective channels and rectify those incorrect ones. Since the channel sequence is also 1-dimensional, its similarity with the temporal sequence inspires us to take Mamba for modeling the correlation in the channel sequence. Based on this concept, we propose a Spatial-Channel State Space Modeling (SCSM) module for spatial-channel state modeling,…
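
The idea of treating the channel axis as a 1-dimensional sequence can be illustrated with a toy example. The sketch below is not the authors' SCSM module and not Mamba's selective scan: it assumes a fixed scalar linear state space recurrence (h_k = a·h_{k-1} + b·u_k, y_k = c·h_k) with hand-picked parameters, whereas Mamba learns input-dependent parameters. The function names (`channel_descriptors`, `ssm_scan`, `channel_gates`, `reweight`), the bidirectional scan, and the sigmoid gating are all hypothetical choices made only for illustration.

```python
import math
from typing import List

FeatureMap = List[List[List[float]]]  # C x H x W as nested lists


def channel_descriptors(features: FeatureMap) -> List[float]:
    """Global-average-pool each channel to one scalar, giving a length-C sequence."""
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in features]


def ssm_scan(u: List[float], a: float = 0.5, b: float = 1.0, c: float = 1.0) -> List[float]:
    """Scalar linear state space recurrence: h_k = a*h_{k-1} + b*u_k, y_k = c*h_k.

    a, b, c are fixed here (an assumption); a real Mamba layer learns them
    and makes them depend on the input.
    """
    h = 0.0
    ys = []
    for x in u:
        h = a * h + b * x
        ys.append(c * h)
    return ys


def channel_gates(features: FeatureMap, a: float = 0.5) -> List[float]:
    """Mix each channel descriptor with its neighbors along the channel axis.

    Scans forward and backward (an illustrative choice) so every channel sees
    both sides, then squashes the result to a gate in (0, 1).
    """
    u = channel_descriptors(features)
    fwd = ssm_scan(u, a)
    bwd = ssm_scan(u[::-1], a)[::-1]
    # Both scans include u_k itself, so subtract it once to avoid double counting.
    mixed = [f + g - x for f, g, x in zip(fwd, bwd, u)]
    return [1.0 / (1.0 + math.exp(-m)) for m in mixed]


def reweight(features: FeatureMap, gates: List[float]) -> FeatureMap:
    """Scale every value in channel k by gate k."""
    return [[[g * v for v in row] for row in ch] for ch, g in zip(features, gates)]


if __name__ == "__main__":
    # Three 2x2 channels with constant values 1, 0, and 2.
    fm = [[[1.0, 1.0], [1.0, 1.0]],
          [[0.0, 0.0], [0.0, 0.0]],
          [[2.0, 2.0], [2.0, 2.0]]]
    gates = channel_gates(fm)
    print("gates:", gates)
    print("reweighted:", reweight(fm, gates))
```

In this toy setup the gate of a channel depends on the other channels through the recurrence rather than on its own descriptor alone, which is the property the abstract appeals to when it describes using inter-channel correlation to highlight effective channels.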
