Polarity Loss for Zero-shot Object Detection
Shafin Rahman, Salman Khan, and Nick Barnes

TL;DR
This paper introduces Polarity loss, a novel training objective that enhances zero-shot object detection by improving visual-semantic alignment and discrimination, leading to better recognition of unseen objects.
Contribution
It proposes Polarity loss, a new loss function that refines semantic embeddings and maximizes prediction gaps to improve zero-shot detection performance.
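The summary above does not give the loss formula, so the following is only an illustrative sketch of the "maximize prediction gaps" idea, not the authors' definition. It assumes a per-class focal-loss term that is re-weighted by a sigmoid of the difference between each class score and the ground-truth class score. The function name `gap_weighted_focal_loss` and the parameters `alpha`, `gamma`, and `beta` are hypothetical choices made for this example.

```python
import numpy as np

def gap_weighted_focal_loss(p, true_idx, alpha=0.25, gamma=2.0, beta=5.0):
    """Illustrative gap-penalized focal loss for one box (assumed form, not the paper's).

    p        : per-class predicted probabilities, each in (0, 1)
    true_idx : index of the ground-truth class
    """
    p = np.clip(np.asarray(p, dtype=float), 1e-7, 1.0 - 1e-7)
    y = np.zeros_like(p)
    y[true_idx] = 1.0

    # Per-class binary focal loss term.
    p_t = np.where(y == 1.0, p, 1.0 - p)
    alpha_t = np.where(y == 1.0, alpha, 1.0 - alpha)
    focal = -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

    # Gap between each class score and the ground-truth class score.
    # A negative class scoring close to (or above) the true class gets a
    # larger weight, so small gaps are penalized more heavily.
    gap = p - p[true_idx]
    weight = 1.0 / (1.0 + np.exp(-beta * gap))  # equals 0.5 for the true class itself

    return float(np.sum(weight * focal))

# A confident, well-separated prediction versus one with a confusable negative.
well_separated = gap_weighted_focal_loss([0.9, 0.1, 0.1], true_idx=0)
confusable = gap_weighted_focal_loss([0.9, 0.8, 0.1], true_idx=0)
```

Under these assumptions, `confusable` comes out larger than `well_separated`, since the second prediction leaves only a small gap between the true class and a competing class.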
Findings
Significant improvements in zero-shot detection performance on the MS-COCO and Pascal VOC datasets.
Enhanced alignment between visual features and semantic descriptions.
Better discrimination between seen, unseen, and background objects.
Abstract
Conventional object detection models require large amounts of training data. In comparison, humans can recognize previously unseen objects by merely knowing their semantic description. To mimic this behaviour, zero-shot object detection aims to recognize and localize 'unseen' object instances by using only their semantic information. The model is first trained to learn the relationships between visual and semantic domains for seen objects, later transferring the acquired knowledge to totally unseen objects. This setting gives rise to the need for correct alignment between visual and semantic concepts, so that the unseen objects can be identified using only their semantic attributes. In this paper, we propose a novel loss function called 'Polarity loss' that promotes correct visual-semantic alignment for improved zero-shot object detection. On one hand, it refines the noisy…
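The transfer step described in the abstract, identifying unseen objects from semantic information alone, can be sketched as a similarity lookup. This is a minimal illustration under stated assumptions, not the paper's architecture: it assumes a learned linear projection `W` from visual features into the semantic space and scores each class by cosine similarity to its semantic vector. The names `zero_shot_scores`, `W`, and `class_embeddings` are hypothetical.

```python
import numpy as np

def zero_shot_scores(visual_feat, W, class_embeddings):
    """Score a region's visual feature against class semantic vectors.

    visual_feat      : (d_vis,) visual feature of a candidate box
    W                : (d_sem, d_vis) projection learned on seen classes (assumed)
    class_embeddings : (C, d_sem) semantic vectors, e.g. word vectors of class names
    Returns a (C,) array of cosine similarities.
    """
    visual_feat = np.asarray(visual_feat, dtype=float)
    W = np.asarray(W, dtype=float)
    E = np.asarray(class_embeddings, dtype=float)

    # Project the visual feature into the semantic space and L2-normalize it.
    projected = W @ visual_feat
    projected = projected / (np.linalg.norm(projected) + 1e-12)

    # L2-normalize each class embedding, then take dot products.
    E = E / (np.linalg.norm(E, axis=1, keepdims=True) + 1e-12)
    return E @ projected

# Toy example: two unseen classes with orthogonal semantic vectors.
unseen_embeddings = np.array([[1.0, 0.0], [0.0, 1.0]])
scores = zero_shot_scores([2.0, 0.1], np.eye(2), unseen_embeddings)
predicted_class = int(np.argmax(scores))
```

Because only `class_embeddings` changes at test time, the same projection trained on seen classes can be reused for unseen ones, which is why the quality of the visual-semantic alignment matters so much in this setting.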
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
