A Closer Look at Cross-Domain Few-Shot Object Detection: Fine-Tuning Matters and Parallel Decoder Helps

Xuanlong Yu; Youyang Sha; Longfei Liu; Xi Shen; Di Yang

arXiv:2603.28182 · cs.CV · March 31, 2026

TL;DR

This paper introduces a hybrid ensemble decoder and a progressive fine-tuning framework for cross-domain few-shot object detection. It reports gains on CD-FSOD, ODinW-13, and RF100-VL, along with improved robustness on a mixed-domain test set.

Contribution

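The work proposes an ensemble decoder whose parallel branches use denoising queries, together with a progressive fine-tuning framework that uses a plateau-aware learning rate schedule. The decoder adds no extra parameters, and the two components aim to improve generalization and stabilize optimization in FSOD.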
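One way to picture a plateau-aware schedule is a rule that lowers the learning rate once a monitored validation loss stops improving. The sketch below illustrates that idea under stated assumptions and is not the paper's schedule: the class name `PlateauAwareLR`, the `factor`, `patience`, and `min_lr` parameters, and the loss values are all hypothetical.

```python
class PlateauAwareLR:
    """Toy plateau-aware schedule (hypothetical, not the paper's rule).

    Multiplies the learning rate by `factor` whenever the monitored loss
    has failed to improve for `patience` consecutive evaluations.
    """

    def __init__(self, lr, factor=0.5, patience=2, min_lr=1e-6):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.min_lr = min_lr
        self.best = float("inf")   # best loss seen so far
        self.bad_steps = 0         # evaluations since the last improvement

    def step(self, val_loss):
        if val_loss < self.best:
            # Still improving: remember the new best and reset the counter.
            self.best = val_loss
            self.bad_steps = 0
        else:
            self.bad_steps += 1
            if self.bad_steps >= self.patience:
                # Plateau detected: decay the learning rate, but not below min_lr.
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.bad_steps = 0
        return self.lr


# Made-up validation losses with two plateaus.
sched = PlateauAwareLR(lr=1e-4, factor=0.5, patience=2)
losses = [1.0, 0.8, 0.8, 0.81, 0.7, 0.7, 0.7]
history = [sched.step(loss) for loss in losses]
print(history)
```

With these made-up losses, the rate is halved after each run of two non-improving evaluations. A rule like this ties decay to observed progress rather than to a fixed step count, which is one plausible reason such schedules suit small training sets.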

Findings

01 Achieves an average score of 41.9 on RF100-VL in the 10-shot setting, outperforming recent methods.
02 Improves robustness to out-of-distribution samples on a mixed-domain test set.
03 Demonstrates effectiveness across multiple datasets, including CD-FSOD, ODinW-13, and RF100-VL.

Abstract

Few-shot object detection (FSOD) is challenging due to unstable optimization and limited generalization arising from the scarcity of training samples. To address these issues, we propose a hybrid ensemble decoder that enhances generalization during fine-tuning. Inspired by ensemble learning, the decoder comprises a shared hierarchical layer followed by multiple parallel decoder branches, where each branch employs denoising queries either inherited from the shared layer or newly initialized to encourage prediction diversity. This design fully exploits pretrained weights without introducing additional parameters, and the resulting diverse predictions can be effectively ensembled to improve generalization. We further leverage a unified progressive fine-tuning framework with a plateau-aware learning rate schedule, which stabilizes optimization and achieves strong few-shot adaptation without…
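The decoder structure described in the abstract (one shared layer feeding several parallel branches whose predictions are ensembled) can be sketched with toy stand-ins. This is a minimal illustration under assumptions, not the authors' implementation: it assumes the branches are existing pretrained layers run side by side, and that ensembling means averaging per-query scores. It also omits the denoising queries entirely. The names `make_layer`, `head`, and `hybrid_ensemble_decode` are hypothetical.

```python
from statistics import fmean


def make_layer(weight, bias):
    """Toy stand-in for a pretrained decoder layer: an elementwise affine map."""
    def layer(queries):
        return [[weight * x + bias for x in q] for q in queries]
    return layer


def head(queries):
    """Toy prediction head: one score per query (mean of its features)."""
    return [fmean(q) for q in queries]


def hybrid_ensemble_decode(queries, shared_layer, branch_layers):
    # The shared layer runs once; every branch starts from its output.
    shared_out = shared_layer(queries)
    # Branches run in parallel on the same input rather than being stacked.
    branch_scores = [head(layer(shared_out)) for layer in branch_layers]
    # Assumed ensembling rule: average each query's score across branches.
    return [fmean(scores) for scores in zip(*branch_scores)]


# Assumption: all layers come from one pretrained stack, so splitting them
# into "shared" and "branches" adds no new parameters.
pretrained = [
    make_layer(1.0, 0.0),  # used as the shared layer
    make_layer(0.5, 0.0),  # branch 1
    make_layer(1.5, 0.0),  # branch 2
    make_layer(1.0, 1.0),  # branch 3
]
queries = [[1.0, 3.0], [2.0, 2.0]]
ensembled = hybrid_ensemble_decode(queries, pretrained[0], pretrained[1:])
print(ensembled)
```

The point of the sketch is the data flow: the branches disagree on individual scores, and the ensemble smooths over that disagreement without any layer beyond those already in the pretrained stack.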

Code & Models

Repositories

Intellindust-AI-Lab/FT-FSOD (GitHub)
