Efficient Few-Shot Object Detection via Knowledge Inheritance
Ze Yang, Chi Zhang, Ruibo Li, Yi Xu, Guosheng Lin

TL;DR
This paper introduces an efficient few-shot object detection framework that achieves state-of-the-art accuracy while significantly reducing adaptation time, addressing the critical efficiency challenge in embedded AI applications.
Contribution
It proposes a novel pretrain-transfer framework with a knowledge inheritance initializer and adaptive length re-scaling, enabling fast and effective adaptation in few-shot object detection.
Findings
Achieves SOTA results on PASCAL VOC, COCO, and LVIS benchmarks.
Adapts 1.8x to 100x faster than existing methods.
Maintains competitive accuracy with no additional computational cost.
Abstract
Few-shot object detection (FSOD), which aims at learning a generic detector that can adapt to unseen tasks with scarce training samples, has witnessed consistent improvement recently. However, most existing methods ignore the efficiency issues, e.g., high computational complexity and slow adaptation speed. Notably, efficiency has become an increasingly important evaluation metric for few-shot techniques due to an emerging trend toward embedded AI. To this end, we present an efficient pretrain-transfer framework (PTF) baseline with no computational increment, which achieves comparable results with previous state-of-the-art (SOTA) methods. Upon this baseline, we devise an initializer named knowledge inheritance (KI) to reliably initialize the novel weights for the box classifier, which effectively facilitates the knowledge transfer process and boosts the adaptation speed. Within the KI…
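The abstract is cut off here before it describes how the KI initializer works, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes each novel class weight is initialized from the mean of that class's support features, and that the resulting vector is then rescaled so its length matches the average length of the pretrained base-class weights (one plausible reading of "adaptive length re-scaling"). The names `inherit_novel_weights`, `base_weights`, and `support_features` are hypothetical.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def l2_norm(v: Sequence[float]) -> float:
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def mean_vector(vectors: Sequence[Sequence[float]]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def inherit_novel_weights(
    base_weights: Sequence[Sequence[float]],
    support_features: Dict[str, Sequence[Sequence[float]]],
    eps: float = 1e-12,
) -> Dict[str, Vector]:
    """Initialize novel-class classifier weights from support features.

    Assumption (not confirmed by the source): the prototype is the mean
    support feature, rescaled to the mean norm of the base-class weights.
    """
    # Target length: average norm of the pretrained base-class weights.
    target = sum(l2_norm(w) for w in base_weights) / len(base_weights)

    novel: Dict[str, Vector] = {}
    for cls, feats in support_features.items():
        proto = mean_vector(feats)
        # Rescale so the novel weight has the same length as a typical base weight.
        scale = target / max(l2_norm(proto), eps)
        novel[cls] = [scale * x for x in proto]
    return novel


# Toy example: two base weights of length 5, one novel class with two shots.
base_weights = [[3.0, 4.0], [0.0, 5.0]]
support_features = {"bird": [[1.0, 0.0], [3.0, 0.0]]}
novel = inherit_novel_weights(base_weights, support_features)
```

In this toy example the novel weight for `"bird"` keeps the direction of its mean support feature but ends up with the same length as the base weights, so the classifier's scores for novel and base classes start on a comparable scale before any fine-tuning.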
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Balanced Selection
