TL;DR
This paper introduces a novel transductive zero-shot hashing method for multi-label image retrieval, effectively handling unseen multi-label images by building a visual-semantic bridge and leveraging both seen and unseen data.
Contribution
It is the first to address multi-label zero-shot hashing with a transductive approach, improving retrieval performance on unseen multi-label images.
Findings
Significantly outperforms existing methods on three datasets.
Effective label prediction for unseen multi-label images.
Improved retrieval accuracy with the proposed approach.
Abstract
Hash coding has been widely used in approximate nearest neighbor search for large-scale image retrieval. Given semantic annotations such as class labels and pairwise similarities of the training data, hashing methods can learn and generate effective and compact binary codes. Because some newly introduced images may contain undefined semantic labels (we call these unseen images), zero-shot hashing techniques have been studied. However, existing zero-shot hashing methods focus on the retrieval of single-label images and cannot handle multi-label images. In this paper, for the first time, a novel transductive zero-shot hashing method is proposed for multi-label unseen image retrieval. In order to predict the labels of the unseen/target data, a visual-semantic bridge is built via instance-concept coherence ranking on the seen/source data. Then, pairwise similarity loss and focal quantization…
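The retrieval setting the abstract opens with, ranking database images by the distance between compact binary codes, can be illustrated with a minimal sketch. It is not the paper's method: the random projection below is a hypothetical stand-in for a learned hash function, and the names `to_binary_codes` and `hamming_rank`, the feature dimension, and the code length are all assumptions chosen for illustration.

```python
import numpy as np

def to_binary_codes(features, projection):
    """Threshold a linear projection at zero to obtain {0, 1} codes.

    The projection is random here (an assumption); a hashing method
    would learn this mapping from labeled training data instead.
    """
    return (features @ projection > 0).astype(np.uint8)

def hamming_rank(query_code, database_codes):
    """Rank database items by Hamming distance to the query code."""
    # Count the bit positions where each database code differs from the query.
    distances = np.count_nonzero(database_codes != query_code, axis=1)
    # Stable sort keeps the original order among items at equal distance.
    order = np.argsort(distances, kind="stable")
    return order, distances

rng = np.random.default_rng(0)
database = rng.normal(size=(1000, 64))    # 1000 synthetic feature vectors
projection = rng.normal(size=(64, 32))    # maps 64-d features to 32-bit codes
db_codes = to_binary_codes(database, projection)

query_code = db_codes[42]                 # query with an item already in the database
order, distances = hamming_rank(query_code, db_codes)
top_10 = order[:10]                       # indices of the 10 nearest codes
```

Because the comparison is a per-bit inequality count, ranking cost grows with the code length rather than with the original feature dimension, which is why short codes suit large-scale search.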
