Self-Supervised Bernoulli Autoencoders for Semi-Supervised Hashing

Ricardo Ñanculef; Francisco Mena; Antonio Macaluso; Stefano Lodi; Claudio Sartori

arXiv:2007.08799·cs.LG·August 11, 2023

TL;DR

This paper introduces a semi-supervised training approach for Bernoulli autoencoders applied to semantic hashing. It improves binary code quality in low-label scenarios by using the model's own label predictions as supervision, and experiments show significant gains over existing methods.

Contribution

It proposes a novel semi-supervised training method for Bernoulli autoencoders that enhances hashing performance when labeled data is limited.

Findings

01  Supervision through a pairwise loss degrades as the number of labels decreases

02  The proposed supervision based on predicted label distributions improves performance when labels are scarce

03  The method achieves results comparable to fully supervised models while using less labeled data
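
The findings above contrast a pairwise loss driven by ground-truth labels with supervision driven by predicted label distributions. The sketch below shows one plausible way such a loss could be written. It is an illustration under stated assumptions, not the paper's exact objective: the function names, the agreement score (inner product of two predicted label distributions), the use of expected Hamming distance between Bernoulli codes, and the `margin` parameter are all hypothetical choices made for this example.

```python
from typing import Sequence


def label_agreement(p: Sequence[float], q: Sequence[float]) -> float:
    """Soft similarity: probability that two items share a label,
    assuming their predicted label distributions are independent.
    (Hypothetical choice for this sketch.)"""
    return sum(a * b for a, b in zip(p, q))


def expected_hamming(u: Sequence[float], v: Sequence[float]) -> float:
    """Expected Hamming distance between two codes whose bits are
    independent Bernoulli variables with activation probabilities u and v."""
    return sum(a * (1.0 - b) + (1.0 - a) * b for a, b in zip(u, v))


def soft_pairwise_loss(
    bits_i: Sequence[float],
    bits_j: Sequence[float],
    labels_i: Sequence[float],
    labels_j: Sequence[float],
    margin: float = 2.0,  # assumed hyperparameter, not taken from the paper
) -> float:
    """Pull codes together when predicted labels agree, and push them at
    least `margin` bits apart when they disagree."""
    s = label_agreement(labels_i, labels_j)
    d = expected_hamming(bits_i, bits_j)
    return s * d + (1.0 - s) * max(0.0, margin - d)


# Same predicted class, identical codes: nothing to penalize.
same = soft_pairwise_loss([1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0])
# Different predicted classes, identical codes: penalized by the full margin.
clash = soft_pairwise_loss([1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
```

Because the agreement score is computed from predictions rather than annotations, a loss of this shape can be evaluated on unlabeled pairs as well, which is the property the findings attribute to the proposed supervision.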

Abstract

Semantic hashing is an emerging technique for large-scale similarity search based on representing high-dimensional data using similarity-preserving binary codes used for efficient indexing and search. It has recently been shown that variational autoencoders, with Bernoulli latent representations parametrized by neural nets, can be successfully trained to learn such codes in supervised and unsupervised scenarios, improving on more traditional methods thanks to their ability to handle the binary constraints architecturally. However, the scenario where labels are scarce has not been studied yet. This paper investigates the robustness of hashing methods based on variational autoencoders to the lack of supervision, focusing on two semi-supervised approaches currently in use. The first augments the variational autoencoder's training objective to jointly model the distribution over the data…
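
To make the idea of searching with similarity-preserving binary codes concrete, here is a minimal sketch of Hamming-distance retrieval. It assumes that an encoder (not shown) has already produced per-bit Bernoulli activation probabilities for each item. The 0.5 threshold, the toy probabilities, and all function names are assumptions made for this illustration and do not come from the paper or its repository.

```python
from typing import List, Sequence


def binarize(probs: Sequence[float], threshold: float = 0.5) -> int:
    """Pack per-bit Bernoulli probabilities into an integer hash code,
    setting a bit to 1 when its probability exceeds the threshold."""
    code = 0
    for p in probs:
        code = (code << 1) | (1 if p > threshold else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer-packed codes."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, database: List[int]) -> List[int]:
    """Return database indices sorted by Hamming distance to the query.
    Python's sort is stable, so ties keep their original order."""
    return sorted(range(len(database)), key=lambda i: hamming(query, database[i]))


# Toy encoder outputs for three database items with 4-bit codes.
db_probs = [
    [0.9, 0.1, 0.8, 0.2],
    [0.2, 0.7, 0.1, 0.9],
    [0.8, 0.2, 0.6, 0.4],
]
db_codes = [binarize(p) for p in db_probs]
query_code = binarize([0.95, 0.05, 0.7, 0.1])
ranking = rank_by_hamming(query_code, db_codes)
```

Packing codes into integers lets the distance be computed with one XOR and a bit count, which is what makes binary codes attractive for large-scale indexing.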

Code & Models

Repositories

amacaluso/SSB-VAE (TensorFlow, official)
