Random Forests Can Hash

Qiang Qiu; Guillermo Sapiro; Alex Bronstein

arXiv:1412.5083 · cs.CV · April 20, 2015 · ICLR · 2 cites

TL;DR

This paper presents a novel random forest-based semantic hashing method that enforces hash consistency within trees and uses information theory for code aggregation, significantly improving large-scale data retrieval performance.

Contribution

It introduces a subspace model for splitting functions and an information-theoretic code aggregation approach for random forests in hashing applications.

Findings

01 Outperforms state-of-the-art hashing methods on large datasets

02 Enforces hash consistency for data from the same class within trees

03 Produces near-optimal class-specific hash codes

Abstract

Hash codes are an efficient data representation needed to cope with ever-growing amounts of data. We introduce a random forest semantic hashing scheme with information-theoretic code aggregation, showing for the first time how random forest, a technique that together with deep learning has shown spectacular results in classification, can also be extended to large-scale retrieval. A traditional random forest fails to enforce the consistency of hashes generated from each tree for data of the same class, i.e., to preserve the underlying similarity, and it also lacks a principled way to aggregate codes across trees. We start with a simple hashing scheme, where independently trained random trees in a forest act as hashing functions. We then propose a subspace model as the splitting function, and show that it enforces the hash consistency in a tree for data from the…
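The "simple hashing scheme" mentioned in the abstract, where each independently built tree acts as a hash function, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the random axis-aligned threshold splits, the choice to encode the root-to-leaf path as bits, and all function names (`build_tree`, `tree_hash`, `build_forest`, `forest_hash`, `hamming`) are hypothetical. It does not implement the paper's subspace splitting model or its information-theoretic code aggregation.

```python
import random


def build_tree(points, depth, rng):
    """Build a random tree of fixed depth (assumed: random axis-aligned splits)."""
    if depth == 0:
        return None  # leaf: nothing left to split
    dim = rng.randrange(len(points[0]))
    values = [p[dim] for p in points]
    threshold = rng.uniform(min(values), max(values))
    # Fall back to the full set if a side is empty, so every path has full depth.
    left = [p for p in points if p[dim] <= threshold] or points
    right = [p for p in points if p[dim] > threshold] or points
    return (dim, threshold,
            build_tree(left, depth - 1, rng),
            build_tree(right, depth - 1, rng))


def tree_hash(tree, x):
    """Hash x with one tree: one bit per split on the root-to-leaf path."""
    bits = []
    node = tree
    while node is not None:
        dim, threshold, left, right = node
        go_right = x[dim] > threshold
        bits.append(1 if go_right else 0)
        node = right if go_right else left
    return bits


def build_forest(points, n_trees, depth, seed=0):
    """Build trees independently; each one is a separate hash function."""
    rng = random.Random(seed)
    return [build_tree(points, depth, rng) for _ in range(n_trees)]


def forest_hash(forest, x):
    """Naive aggregation (assumed): concatenate the per-tree code blocks."""
    code = []
    for tree in forest:
        code.extend(tree_hash(tree, x))
    return code


def hamming(a, b):
    """Number of positions where two equal-length codes differ."""
    return sum(u != v for u, v in zip(a, b))


# Toy data: two loose clusters in 2-D (illustrative values only).
data = [(0.1, 0.2), (0.15, 0.25), (0.9, 0.8), (0.85, 0.95)]
forest = build_forest(data, n_trees=4, depth=3)
codes = [forest_hash(forest, x) for x in data]
```

Retrieval would then compare a query's code against stored codes with `hamming`. Because the trees in this sketch are built without labels and without coordination, nothing forces same-class points to share codes across trees, which is the consistency gap the abstract says the proposed method addresses.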

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Algorithms and Data Compression · Machine Learning and Data Classification