On Utilizing Relationships for Transferable Few-Shot Fine-Grained Object Detection
Ambar Pal, Arnau Ramisa, Amit Kumar K C, René Vidal

TL;DR
This paper introduces RelDetect, a probabilistic model that uses common-sense relationships to perform fine-grained object detection with minimal annotations. It outperforms traditional fine-tuning methods, especially in zero-shot transfer scenarios.
Contribution
The paper presents a novel relational knowledge-based approach for fine-grained object detection that requires significantly fewer annotations and improves transferability over existing methods.
Findings
RelDetect achieves performance comparable to fine-tuning while using only 0.2% of the annotations.
It outperforms baselines by 5 mAP on unseen datasets.
It uses common-sense relationships to enable zero-shot transfer.
Abstract
State-of-the-art object detectors are fast and accurate, but they require a large amount of well-annotated training data to obtain good performance. However, obtaining a large amount of training annotations specific to a particular task, i.e., fine-grained annotations, is costly in practice. In contrast, obtaining common-sense relationships from text, e.g., "a table-lamp is a lamp that sits on top of a table", is much easier. Additionally, common-sense relationships like "on-top-of" are easy to annotate in a task-agnostic fashion. In this paper, we propose a probabilistic model that uses such relational knowledge to transform an off-the-shelf detector of coarse object categories (e.g., "table", "lamp") into a detector of fine-grained categories (e.g., "table-lamp"). We demonstrate that our method, RelDetect, achieves performance competitive with fine-tuning-based state-of-the-art object…
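To make the abstract's central idea concrete, here is a minimal sketch of how coarse detections and an "on-top-of" relation could be combined into a fine-grained score. It is not the paper's model: the `Detection` class, the geometric `on_top_of_score` heuristic, and the product-of-scores combination (which assumes the detections and the relation are independent) are all hypothetical choices made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str
    score: float  # detector confidence in [0, 1]
    box: tuple    # (x_min, y_min, x_max, y_max); y grows downward


def on_top_of_score(upper: Detection, lower: Detection) -> float:
    """Hypothetical heuristic for 'upper sits on top of lower' (not from the paper)."""
    ux0, uy0, ux1, uy1 = upper.box
    lx0, ly0, lx1, ly1 = lower.box
    width, height = ux1 - ux0, uy1 - uy0
    if width <= 0 or height <= 0:
        return 0.0
    # Fraction of the upper box's width that lies over the lower box.
    h_frac = max(0.0, min(ux1, lx1) - max(ux0, lx0)) / width
    # Penalize any gap between the bottom of upper and the top of lower.
    v_score = max(0.0, 1.0 - abs(uy1 - ly0) / height)
    return h_frac * v_score


def fine_grained_scores(detections, upper_label, lower_label, fine_label):
    """Re-score each `upper_label` box as `fine_label` (assumed independence)."""
    uppers = [d for d in detections if d.label == upper_label]
    lowers = [d for d in detections if d.label == lower_label]
    results = []
    for u in uppers:
        # Best supporting coarse detection: its confidence times the relation score.
        support = max((l.score * on_top_of_score(u, l) for l in lowers), default=0.0)
        results.append(Detection(fine_label, u.score * support, u.box))
    return results


# Toy input: one lamp resting on a table, one lamp far from any table.
coarse = [
    Detection("lamp", 0.9, (40, 10, 60, 50)),
    Detection("table", 0.8, (20, 50, 100, 90)),
    Detection("lamp", 0.9, (200, 10, 220, 50)),
]
results = fine_grained_scores(coarse, "lamp", "table", "table-lamp")
```

In this toy input, the lamp whose box rests on the table keeps a high "table-lamp" score, while the lamp with no table beneath it is scored zero. A learned relation classifier could replace the hand-written heuristic without changing the overall structure.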
Taxonomy
Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
