High-level Codes and Fine-grained Weights for Online Multi-modal Hashing Retrieval
Yu-Wei Zhan, Xiao-Ming Wu, Xin Luo, Yinwei Wei, Xin-Shun Xu

TL;DR
This paper introduces HCFW, an online multi-modal hashing method that uses high-level codes and fine-grained weights to improve long-term hash code consistency and multi-modal data fusion. The authors support these claims with extensive experiments.
Contribution
HCFW is the first online multi-modal hashing approach to incorporate category-level high-level codes and instance-level fine-grained weights for improved performance.
Findings
HCFW achieves superior retrieval accuracy on benchmark datasets.
The method maintains hash code consistency over long-term incremental learning.
HCFW effectively fuses multi-modal data at the instance level.
Abstract
In the real world, multi-modal data often appears in a streaming fashion, and there is a growing demand for similarity retrieval from such non-stationary data, especially at a large scale. In response to this need, online multi-modal hashing has gained significant attention. However, existing online multi-modal hashing methods face challenges related to the inconsistency of hash codes during long-term learning and inefficient fusion of different modalities. In this paper, we present a novel approach to supervised online multi-modal hashing, called High-level Codes, Fine-grained Weights (HCFW). To address these problems, HCFW makes non-trivial contributions along two primary dimensions: 1) Online Hashing Perspective. To ensure the long-term consistency of hash codes, especially in incremental learning scenarios, HCFW learns high-level codes derived from category-level…
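The abstract's idea of instance-level fusion can be illustrated with a minimal, generic sketch. This is not the HCFW algorithm, whose details are not given here. It only shows one common way that per-instance modality weights could be combined with linear projections and sign binarization to produce hash codes, followed by Hamming-distance ranking. The function names `fuse_and_hash` and `hamming_distances`, the random projections, and the uniform weights are all assumptions made for illustration.

```python
import numpy as np


def fuse_and_hash(features, projections, weights):
    """Hypothetical helper: weighted fusion of modalities, then sign binarization.

    features:    list of M arrays, each of shape (n, d_m)
    projections: list of M arrays, each of shape (d_m, r)
    weights:     array of shape (n, M), one weight per instance and modality
    Returns an (n, r) int8 array with entries in {-1, +1}.
    """
    n = features[0].shape[0]
    r = projections[0].shape[1]
    fused = np.zeros((n, r))
    for m, (X, P) in enumerate(zip(features, projections)):
        # weights[:, [m]] has shape (n, 1) and broadcasts across the r code bits
        fused += weights[:, [m]] * (X @ P)
    # Map non-negative values to +1 and negative values to -1
    return np.where(fused >= 0, 1, -1).astype(np.int8)


def hamming_distances(query_codes, db_codes):
    """Pairwise Hamming distances between {-1, +1} codes of length r.

    For such codes, inner product = r - 2 * distance.
    """
    r = db_codes.shape[1]
    inner = query_codes.astype(np.int32) @ db_codes.astype(np.int32).T
    return (r - inner) // 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two assumed modalities (e.g. image and text features) for 5 instances
    feats = [rng.standard_normal((5, 4)), rng.standard_normal((5, 3))]
    projs = [rng.standard_normal((4, 8)), rng.standard_normal((3, 8))]
    w = np.full((5, 2), 0.5)  # uniform weights as a placeholder
    codes = fuse_and_hash(feats, projs, w)
    # Rank database items by Hamming distance to the first instance
    print(np.argsort(hamming_distances(codes[:1], codes)[0], kind="stable"))
```

In an online setting, the projections and weights would be updated as each new data chunk arrives. How HCFW performs that update, and how its high-level codes enter the objective, is not specified in the text above, so the sketch leaves it out.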
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Video Surveillance and Tracking Methods · Multimodal Machine Learning Applications
