# Locality-sensitive hashing of curves

**Authors:** Anne Driemel, Francesco Silvestri

arXiv: 1703.04040 · 2017-03-14

## TL;DR

This paper introduces the first locality-sensitive hashing schemes for polygonal curves under discrete Fréchet and dynamic time warping distances, enabling efficient approximate nearest neighbor searches in high-dimensional curve datasets.

## Contribution

It develops novel LSH schemes for curve similarity measures, addressing alignment optimization challenges, and provides trade-offs between accuracy and performance for near-neighbor data structures.

## Key findings

- Achieves near-linear space and query time for certain curve complexities.
- Provides approximation guarantees with adjustable parameters.
- Improves upon previous results for short curves in unconstrained alignments.

## Abstract

We study data structures for storing a set of polygonal curves in ${\rm R}^d$ such that, given a query curve, we can efficiently retrieve similar curves from the set, where similarity is measured using the discrete Fr\'echet distance or the dynamic time warping distance. To this end we devise the first locality-sensitive hashing schemes for these distance measures. A major challenge is posed by the fact that these distance measures internally optimize the alignment between the curves. We give solutions for different types of alignments including constrained and unconstrained versions. For unconstrained alignments, we improve over a result by Indyk from 2002 for short curves. Let $n$ be the number of input curves and let $m$ be the maximum complexity of a curve in the input. In the particular case where $m \leq \frac{\alpha}{4d} \log n$, for some fixed $\alpha>0$, our solutions imply an approximate near-neighbor data structure for the discrete Fr\'echet distance that uses space in $O(n^{1+\alpha}\log n)$ and achieves query time in $O(n^{\alpha}\log^2 n)$ and constant approximation factor. Furthermore, our solutions provide a trade-off between approximation quality and computational performance: for any parameter $k \in [m]$, we can give a data structure that uses space in $O(2^{2k}m^{k-1} n \log n + nm)$, answers queries in $O( 2^{2k} m^{k}\log n)$ time and achieves approximation factor in $O(m/k)$.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1703.04040/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/1703.04040/full.md

---
Source: https://tomesphere.com/paper/1703.04040