Composite Correlation Quantization for Efficient Multimodal Retrieval
Mingsheng Long, Yue Cao, Jianmin Wang, Philip S. Yu

TL;DR
This paper introduces Composite Correlation Quantization (CCQ), a novel multimodal hashing method that efficiently learns isomorphic representations and compact binary codes for improved cross-modal retrieval accuracy.
Contribution
CCQ jointly learns correlation-maximal mappings and composite quantizers, enabling seamless, accurate, and efficient multimodal hashing from both paired and partially paired data.
Findings
CCQ outperforms state-of-the-art hashing methods in accuracy.
CCQ achieves linear-time training for large-scale data.
CCQ effectively handles both unimodal and cross-modal retrieval tasks.
Abstract
Efficient similarity retrieval from large-scale multimodal databases is pervasive in modern search engines and social networks. To support queries across content modalities, the system should enable cross-modal correlation and computation-efficient indexing. While hashing methods have shown great potential in achieving this goal, current attempts generally fail to learn isomorphic hash codes in a seamless scheme: they embed multiple modalities in a continuous isomorphic space and then separately threshold the embeddings into binary codes, which incurs a substantial loss of retrieval accuracy. In this paper, we approach seamless multimodal hashing by proposing a novel Composite Correlation Quantization (CCQ) model. Specifically, CCQ jointly finds correlation-maximal mappings that transform different modalities into an isomorphic latent space, and learns composite quantizers that convert the…
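As a rough illustration of the two-stage scheme the abstract criticizes (embed each modality into a shared space, then threshold separately into bits), the following minimal sketch may help. It is not the CCQ method: the projections are random placeholders standing in for learned correlation-maximal mappings, and the names `sign_hash`, `hamming_rank`, the feature dimensions, and the code length are all assumptions made for this example.

```python
import numpy as np


def sign_hash(features, projection):
    """Project features into the shared latent space, then threshold to bits."""
    latent = features @ projection           # shape: (n_items, n_bits)
    return (latent > 0).astype(np.uint8)     # 1 where the coordinate is positive


def hamming_rank(query_code, database_codes):
    """Rank database items by Hamming distance to a single query code."""
    distances = np.count_nonzero(database_codes != query_code, axis=1)
    order = np.argsort(distances, kind="stable")
    return order, distances


# Hypothetical toy data: 5 images (6-d features) and 5 texts (4-d features).
rng = np.random.default_rng(0)
n_bits = 8
image_features = rng.standard_normal((5, 6))
text_features = rng.standard_normal((5, 4))

# Placeholder projections (random, NOT learned) mapping both modalities
# into the same n_bits-dimensional space.
image_projection = rng.standard_normal((6, n_bits))
text_projection = rng.standard_normal((4, n_bits))

image_codes = sign_hash(image_features, image_projection)
text_codes = sign_hash(text_features, text_projection)

# Cross-modal query: use the first text code to rank the image database.
order, distances = hamming_rank(text_codes[0], image_codes)
print("ranking:", order, "distances:", distances)
```

Because the thresholding step here is applied after, and independently of, the embedding, any quantization error it introduces is invisible to the mapping. That separation is the accuracy loss the abstract attributes to existing methods, and it is what a jointly learned quantizer is meant to avoid.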
