Image Hashing via Cross-View Code Alignment in the Age of Foundation Models

Ilyass Moummad; Kawtar Zaher; Hervé Goëau; Alexis Joly

arXiv:2510.27584·cs.CV·April 7, 2026

TL;DR

CroVCA is a simple, efficient hashing method that aligns binary codes across semantically aligned views using a lightweight network. It reaches state-of-the-art results on large-scale retrieval benchmarks in as few as 5 training epochs.

Contribution

A unified, fast hashing approach that combines cross-view code alignment with a lightweight network, enabling quick training and applicability to both unsupervised and supervised large-scale retrieval.

Findings

01. State-of-the-art results achieved in just 5 epochs.

02. Unsupervised hashing on COCO completes in under 2 minutes.

03. Supervised hashing on ImageNet100 completes in about 3 minutes.

Abstract

Efficient large-scale retrieval requires representations that are both compact and discriminative. Foundation models provide powerful visual and multimodal embeddings, but nearest neighbor search in these high-dimensional spaces is computationally expensive. Hashing offers an efficient alternative by enabling fast Hamming distance search with binary codes, yet existing approaches often rely on complex pipelines, multi-term objectives, designs specialized for a single learning paradigm, and long training times. We introduce CroVCA (Cross-View Code Alignment), a simple and unified principle for learning binary codes that remain consistent across semantically aligned views. A single binary cross-entropy loss enforces alignment, while coding-rate maximization serves as an anti-collapse regularizer to promote balanced and diverse codes. To implement this, we design HashCoder, a lightweight…
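The two ingredients named in the abstract, a binary cross-entropy alignment term and a coding-rate regularizer, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the symmetric form of the alignment term, the mapping of logits to relaxed codes in [-1, 1], the standard coding-rate formula R(Z) = 1/2 log det(I + d/(n ε²) ZᵀZ), and all names (`alignment_loss`, `coding_rate`, `crovca_style_loss`, `eps`, `weight`) are assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce(probs, targets, tiny=1e-7):
    """Mean binary cross-entropy between bit probabilities and 0/1 targets."""
    probs = np.clip(probs, tiny, 1.0 - tiny)
    return float(-np.mean(targets * np.log(probs)
                          + (1.0 - targets) * np.log(1.0 - probs)))

def alignment_loss(logits_a, logits_b):
    """Assumed symmetric form: each view's bit probabilities are pushed
    toward the hard (thresholded) bits of the other view."""
    p_a, p_b = sigmoid(logits_a), sigmoid(logits_b)
    t_a = (logits_a > 0).astype(float)
    t_b = (logits_b > 0).astype(float)
    return 0.5 * (bce(p_a, t_b) + bce(p_b, t_a))

def coding_rate(z, eps=0.5):
    """Coding rate of n codes of dimension d (rows of z):
    0.5 * logdet(I + d / (n * eps^2) * z^T z).
    Larger values indicate more spread-out, less collapsed codes."""
    n, d = z.shape
    mat = np.eye(d) + (d / (n * eps ** 2)) * (z.T @ z)
    _, logdet = np.linalg.slogdet(mat)
    return 0.5 * float(logdet)

def crovca_style_loss(logits_a, logits_b, weight=1.0, eps=0.5):
    """Alignment term minus a weighted coding-rate term (hypothetical weighting)."""
    # Relaxed codes in [-1, 1]; this relaxation is an assumption of the sketch.
    z_a = 2.0 * sigmoid(logits_a) - 1.0
    z_b = 2.0 * sigmoid(logits_b) - 1.0
    rate = 0.5 * (coding_rate(z_a, eps) + coding_rate(z_b, eps))
    return alignment_loss(logits_a, logits_b) - weight * rate
```

In this sketch, two views that produce the same confident bits give a small alignment term, while a batch in which every sample maps to the same code gives a low coding rate and is therefore penalized, which is the anti-collapse role the abstract describes.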
