DOCK: Detecting Objects by transferring Common-sense Knowledge
Krishna Kumar Singh, Santosh Divvala, Ali Farhadi, Yong Jae Lee

TL;DR
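This paper introduces DOCK, a scalable object detection method that transfers common-sense knowledge at the region level from source categories, which have bounding box annotations, to target categories, which have only image-level annotations. The paper reports significantly improved detection performance on MS COCO.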
Contribution
DOCK measures similarity at the region level rather than the image level, and uses common-sense cues (attribute, spatial, etc.) acquired automatically from readily available knowledge bases to guide transfer from source to target categories.
Findings
Common-sense knowledge improves detection accuracy.
Region-level similarity outperforms image-level similarity.
Significant performance gains on the MS COCO dataset.
Abstract
We present a scalable approach for Detecting Objects by transferring Common-sense Knowledge (DOCK) from source to target categories. In our setting, the training data for the source categories have bounding box annotations, while those for the target categories only have image-level annotations. Current state-of-the-art approaches focus on image-level visual or semantic similarity to adapt a detector trained on the source categories to the new target categories. In contrast, our key idea is to (i) use similarity not at the image-level, but rather at the region-level, and (ii) leverage richer common-sense (based on attribute, spatial, etc.) to guide the algorithm towards learning the correct detections. We acquire such common-sense cues automatically from readily-available knowledge bases without any extra human effort. On the challenging MS COCO dataset, we find that common-sense…
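To make the region-level idea concrete, here is a minimal sketch of how a similarity-weighted prior over region proposals might steer a weakly supervised detector toward the right box. It is not the authors' implementation: the function names, the multiplicative combination rule, and all toy numbers are assumptions made for illustration only.

```python
from typing import Dict, List


def region_prior(source_scores: List[Dict[str, float]],
                 similarity: Dict[str, float]) -> List[float]:
    """Per-region prior for one target class (hypothetical helper).

    Each region's prior is the similarity-weighted average of the
    source-detector scores on that region.
    """
    total = sum(similarity.values())
    priors = []
    for scores in source_scores:
        weighted = sum(similarity[c] * scores.get(c, 0.0) for c in similarity)
        priors.append(weighted / total if total > 0 else 0.0)
    return priors


def pick_region(weak_scores: List[float], priors: List[float]) -> int:
    """Index of the region that maximises weak score times prior (assumed rule)."""
    combined = [w * p for w, p in zip(weak_scores, priors)]
    return max(range(len(combined)), key=combined.__getitem__)


# Toy data, invented for illustration: three region proposals in one image.
# Source detectors ("horse", "car") were trained with bounding boxes.
source_scores = [
    {"horse": 0.9, "car": 0.05},
    {"horse": 0.1, "car": 0.8},
    {"horse": 0.2, "car": 0.1},
]
# Assumed similarity of a target class (say, "zebra") to each source class.
similarity = {"horse": 0.8, "car": 0.1}
# Scores from a weakly supervised model trained on image-level labels only.
weak_scores = [0.5, 0.6, 0.4]

priors = region_prior(source_scores, similarity)
best = pick_region(weak_scores, priors)
print(best)
```

In this toy case the weak scores alone would favour the second region, while the region-level prior shifts the choice to the first region, the one the similar source detector responds to.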
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications
