The Bitwise Hashing Trick for Personalized Search

Braddock Gaskill

arXiv:1910.08646·cs.IR·October 22, 2019


TL;DR

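This paper applies the hashing trick to build one-bit-per-dimension feature vectors for personalized search. Compared with floating-point vectors, the data structures are about an order of magnitude smaller with equal or better relevance quality, and a simple method for combining bit vectors cuts compute time by about an order of magnitude at a small cost in accuracy.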

Contribution

The paper introduces feature bit vectors, built with the hashing trick, for fast lexical comparison of large numbers of short text strings in search personalization and other personalization applications.

Findings

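01. Using a single bit per dimension instead of floating point gives an order-of-magnitude reduction in data structure size.
02. A simple method for combining bit vectors gives an order-of-magnitude improvement in compute time on the simulated personalization task, with only a small decrease in accuracy.
03. Relevance quality is preserved or even improved with the single-bit representation.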

Abstract

Many real-world problems require fast and efficient lexical comparison of large numbers of short text strings. Search personalization is one such domain. We introduce the use of feature bit vectors using the hashing trick for improving relevance in personalized search and other personalization applications. We present results of several lexical hashing and comparison methods. These methods are applied to a user's historical behavior and are used to predict future behavior. Using a single bit per dimension instead of floating point results in an order of magnitude decrease in data structure size, while preserving or even improving quality. We use real data to simulate a search personalization task. A simple method for combining bit vectors demonstrates an order of magnitude improvement in compute time on the task with only a small decrease in accuracy.
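The general idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the whitespace tokenizer, the 1024-bit width (`NUM_BITS`), the BLAKE2b hash, the OR-based combination, and the overlap and Jaccard scores are all assumptions chosen for illustration, and every function name here is hypothetical. Each token is hashed to one bit position, a string becomes a bit vector held in a Python integer, and comparison reduces to bitwise AND/OR plus a population count.

```python
import hashlib

NUM_BITS = 1024  # assumed vector width; the abstract does not state one


def tokens(text):
    """Lowercased whitespace tokens (an assumed, deliberately simple tokenizer)."""
    return text.lower().split()


def bit_index(token, num_bits=NUM_BITS):
    """Map a token to a bit position using a stable hash (the hashing trick)."""
    digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big") % num_bits


def to_bit_vector(text, num_bits=NUM_BITS):
    """Set one bit per token; the vector is stored in a Python int."""
    vec = 0
    for tok in tokens(text):
        vec |= 1 << bit_index(tok, num_bits)
    return vec


def popcount(vec):
    """Number of set bits."""
    return bin(vec).count("1")


def combine(vectors):
    """Merge several bit vectors with bitwise OR (one assumed way to combine)."""
    out = 0
    for v in vectors:
        out |= v
    return out


def overlap(a, b):
    """Count of bit positions set in both vectors."""
    return popcount(a & b)


def jaccard(a, b):
    """Shared bits divided by bits set in either vector; 0.0 if both are empty."""
    union = popcount(a | b)
    return popcount(a & b) / union if union else 0.0


# Hypothetical user history and candidate results, for illustration only.
history = ["red running shoes", "trail running socks"]
profile = combine(to_bit_vector(q) for q in history)

candidates = ["running shoes sale", "garden hose reel"]
ranked = sorted(
    candidates,
    key=lambda c: overlap(profile, to_bit_vector(c)),
    reverse=True,
)
```

Under these assumptions, a user's whole history collapses into a single `profile` vector, so scoring a candidate takes one AND and one population count rather than a comparison against every historical string. Distinct tokens can collide on the same bit, so the scores are approximate. As a rough size comparison, a vector of d dimensions stored as 32-bit floats takes 4d bytes, while one bit per dimension takes d/8 bytes; the float width the paper compares against is not stated in the abstract.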
