# qwLSH: Cache-conscious Indexing for Processing Similarity Search Query Workloads in High-Dimensional Spaces

**Authors:** Omid Jafari, John Ossorgin, Parth Nagarkar

arXiv: 1907.11803 · 2019-07-30

## TL;DR

qwLSH is an LSH-based index structure for similarity search query workloads in high-dimensional spaces. It uses cost models to divide a given cache while a workload is processed, which lets it run faster than existing techniques.

## Contribution

The paper introduces qwLSH, a cache-conscious index structure that optimizes similarity search workload processing in high-dimensional spaces using novel cost models.

## Key findings

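- Given a query workload, qwLSH processes it faster than existing techniques.
- It divides a given cache during workload processing according to novel cost models, rather than treating each query in isolation.
- The reported experiments attribute the speedup to these cost models and the associated cache strategies.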

## Abstract

Similarity search queries in high-dimensional spaces are an important type of query in many domains such as image processing, machine learning, etc. Since exact similarity search indexing techniques suffer from the well-known curse of dimensionality in high-dimensional spaces, approximate search techniques are often utilized instead. Locality Sensitive Hashing (LSH) has been shown to be an effective approximate search method for solving similarity search queries in high-dimensional spaces. Oftentimes, queries in real-world settings arrive as part of a query workload. LSH and its variants are particularly designed to solve single queries effectively. They suffer from one major drawback while executing query workloads: they do not take into consideration important data characteristics for effective cache utilization while designing the index structures. In this paper, we present qwLSH, an index structure for efficiently processing similarity search query workloads in high-dimensional spaces. We intelligently divide a given cache during processing of a query workload by using novel cost models. Experimental results show that, given a query workload, qwLSH is able to perform faster than existing techniques due to its unique cost models and strategies.
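To make the setting in the abstract concrete, the following is a minimal sketch of a generic Euclidean LSH index whose bucket lookups go through a small cache shared by all queries in a workload. It is not qwLSH: the paper's cost models are not described in this summary, so a plain LRU policy stands in for them here. The class name `ToyLSH`, its parameters, and the hash form `floor((a·v + b) / width)` are illustrative assumptions, and the sketch assumes all points are added before any query is issued.

```python
import math
import random
from collections import OrderedDict, defaultdict


class ToyLSH:
    """Toy Euclidean LSH index with an LRU cache over fetched buckets.

    Hypothetical illustration only; the LRU policy is a stand-in and is
    not the cost-model-based cache division used by qwLSH.
    """

    def __init__(self, dim, num_tables=4, hashes_per_table=3, width=4.0,
                 cache_size=8, seed=0):
        rng = random.Random(seed)
        self.width = width
        # Each table owns a list of (random direction a, random offset b).
        self.tables = [
            [([rng.gauss(0, 1) for _ in range(dim)], rng.uniform(0, width))
             for _ in range(hashes_per_table)]
            for _ in range(num_tables)
        ]
        self.buckets = [defaultdict(list) for _ in range(num_tables)]
        self.points = []
        self.cache = OrderedDict()  # (table id, bucket key) -> point ids
        self.cache_size = cache_size
        self.hits = 0
        self.misses = 0

    def _key(self, t, v):
        # Concatenate the quantized projections of v for table t.
        return tuple(
            math.floor((sum(x * y for x, y in zip(a, v)) + b) / self.width)
            for a, b in self.tables[t]
        )

    def add(self, v):
        pid = len(self.points)
        self.points.append(v)
        for t in range(len(self.tables)):
            self.buckets[t][self._key(t, v)].append(pid)
        return pid

    def _fetch(self, t, key):
        # Serve the bucket from the cache if present, else load and cache it.
        ck = (t, key)
        if ck in self.cache:
            self.hits += 1
            self.cache.move_to_end(ck)
            return self.cache[ck]
        self.misses += 1
        bucket = self.buckets[t].get(key, [])
        self.cache[ck] = bucket
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict least recently used
        return bucket

    def query(self, q, k=1):
        # Gather candidates from the matching bucket in every table,
        # then rank them by exact Euclidean distance.
        candidates = set()
        for t in range(len(self.tables)):
            candidates.update(self._fetch(t, self._key(t, q)))
        ranked = sorted(candidates, key=lambda pid: math.dist(q, self.points[pid]))
        return ranked[:k]


index = ToyLSH(dim=2)
for p in [(0.0, 0.0), (0.1, 0.0), (50.0, 50.0)]:
    index.add(p)

first = index.query((0.0, 0.0))   # cold cache: every bucket lookup misses
second = index.query((0.0, 0.0))  # repeated query: every lookup hits
print(first, second, index.hits, index.misses)
```

The repeated query touches the same buckets as the first one, so all of its lookups are served from the cache. Queries in a real workload overlap only partially, which is why the way the cache is divided matters; that decision is what the paper's cost models address.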

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.11803/full.md

## Figures

25 figures with captions in the complete paper: https://tomesphere.com/paper/1907.11803/full.md

## References

30 references — full list in the complete paper: https://tomesphere.com/paper/1907.11803/full.md

---
Source: https://tomesphere.com/paper/1907.11803