Cos R-CNN for Online Few-shot Object Detection
Gratianus Wesley Putra Data, Henry Howard-Jenkins, David Murray, Victor Prisacariu

TL;DR
Cos R-CNN introduces a cosine similarity-based approach for online few-shot object detection, enabling rapid adaptation to new classes without fine-tuning and outperforming existing methods on benchmark datasets.
Contribution
It presents a novel exemplar-based R-CNN framework utilizing cosine similarity for effective online few-shot detection without fine-tuning.
Findings
Achieves state-of-the-art results on the 5-way ImageNet few-shot detection benchmark.
Outperforms existing online methods by more than 8%, 3%, and 1% in the 1-, 5-, and 10-shot scenarios, respectively.
Improves performance by up to 20% on novel classes in the VOC dataset.
Abstract
We propose Cos R-CNN, a simple exemplar-based R-CNN formulation that is designed for online few-shot object detection. That is, it is able to localise and classify novel object categories in images with few examples without fine-tuning. Cos R-CNN frames detection as a learning-to-compare task: unseen classes are represented as exemplar images, and objects are detected based on their similarity to these exemplars. The cosine-based classification head allows for dynamic adaptation of classification parameters to the exemplar embedding, and encourages the clustering of similar classes in embedding space without the need for manual tuning of distance-metric hyperparameters. This simple formulation achieves best results on the recently proposed 5-way ImageNet few-shot detection benchmark, beating the online 1/5/10-shot scenarios by more than 8/3/1%, as well as performing up to 20% better in…
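The learning-to-compare idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the averaging of several shots into one prototype per class, and the omission of a background class and of the region-proposal stage are all assumptions made for illustration. It only shows how region embeddings could be scored against exemplar embeddings by cosine similarity, so that new classes are added by supplying exemplars rather than by fine-tuning.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each vector along the last axis to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def cosine_classify(region_embeddings, exemplar_embeddings):
    """Score candidate regions against class exemplars by cosine similarity.

    region_embeddings:   array of shape (R, D), one embedding per region.
    exemplar_embeddings: array of shape (K, S, D), S shots for each of K classes.
    Returns (cos, pred): cos has shape (R, K); pred has shape (R,).
    """
    # Assumption: the S shots of a class are averaged into a single prototype.
    prototypes = l2_normalize(exemplar_embeddings.mean(axis=1))  # (K, D)
    regions = l2_normalize(region_embeddings)                    # (R, D)
    # Dot products of unit vectors are cosine similarities in [-1, 1].
    cos = regions @ prototypes.T                                 # (R, K)
    pred = cos.argmax(axis=1)
    return cos, pred


# Toy example: 2 classes, 1 shot each, 2-dimensional embeddings.
exemplars = np.array([[[1.0, 0.0]], [[0.0, 1.0]]])
regions = np.array([[2.0, 0.1], [0.1, 3.0]])
cos, pred = cosine_classify(regions, exemplars)
```

Because both sides are normalized, the scores depend only on the direction of the embeddings, not their magnitude, which is one reason a cosine head can accept exemplars of unseen classes without retuning a distance scale.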
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
