# Approximate Nearest Neighbor for Curves: Simple, Efficient, and Deterministic

**Authors:** Arnold Filtser, Omrit Filtser, Matthew J. Katz

arXiv: 1902.07562 · 2022-01-12

## TL;DR

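This paper introduces a simple, deterministic data structure for approximate near-neighbor search among curves under the discrete Fréchet or dynamic time warping distance. For $n$ curves of $m$ points each in $d$ dimensions, it uses $n\cdot O(\frac{1}{\varepsilon})^{md}$ storage space and answers queries in $O(md)$ time, an exponential improvement over all previous bounds.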

## Contribution

The authors present a simple, deterministic data structure for the $(1+\varepsilon,r)$-approximate near-neighbor problem for curves (ANNC) whose query time and storage space are exponentially smaller than all previous bounds.

## Key findings

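- Improves exponentially on all previous bounds for both query time ($O(md)$) and storage space ($n\cdot O(\frac{1}{\varepsilon})^{md}$).
- Handles the asymmetric version, where query curves have length $k \ll m$, with essentially the same bounds except that $m$ is replaced by $k$.
- Extends to a version of approximate range counting for curves, with similar bounds.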

## Abstract

In the $(1+\varepsilon,r)$-approximate near-neighbor problem for curves (ANNC) under some distance measure $\delta$, the goal is to construct a data structure for a given set $\mathcal{C}$ of curves that supports approximate near-neighbor queries: Given a query curve $Q$, if there exists a curve $C\in\mathcal{C}$ such that $\delta(Q,C)\le r$, then return a curve $C'\in\mathcal{C}$ with $\delta(Q,C')\le(1+\varepsilon)r$. There exists an efficient reduction from the $(1+\varepsilon)$-approximate nearest-neighbor problem to ANNC, where in the former problem the answer to a query is a curve $C\in\mathcal{C}$ with $\delta(Q,C)\le(1+\varepsilon)\cdot\delta(Q,C^*)$, where $C^*$ is the curve of $\mathcal{C}$ closest to $Q$. Given a set $\mathcal{C}$ of $n$ curves, each consisting of $m$ points in $d$ dimensions, we construct a data structure for ANNC that uses $n\cdot O(\frac{1}{\varepsilon})^{md}$ storage space and has $O(md)$ query time (for a query curve of length $m$), where the similarity between two curves is their discrete Fréchet or dynamic time warping distance. Our method is simple to implement, deterministic, and results in an exponential improvement in both query time and storage space compared to all previous bounds. Further, we also consider the asymmetric version of ANNC, where the length of the query curves is $k \ll m$, and obtain essentially the same storage and query bounds as above, except that $m$ is replaced by $k$. Finally, we apply our method to a version of approximate range counting for curves and achieve similar bounds.
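To make the problem statement concrete, the sketch below computes the two distance measures named in the abstract with the standard textbook dynamic programs and answers a $(1+\varepsilon,r)$ near-neighbor query by a linear scan. This is only a brute-force baseline that illustrates the query contract; it is not the paper's data structure, and it does not achieve the $O(md)$ query time. The function names (`discrete_frechet`, `dtw`, `near_neighbor_bruteforce`) and the use of the Euclidean metric between points are assumptions made for illustration.

```python
import math
from typing import Callable, Optional, Sequence

Point = Sequence[float]
Curve = Sequence[Point]


def _align(P: Curve, Q: Curve, combine: Callable[[float, float], float]) -> float:
    """Shared O(|P|*|Q|) dynamic program over monotone alignments of P and Q."""
    inf = math.inf
    dp = [[inf] * len(Q) for _ in P]
    for i in range(len(P)):
        for j in range(len(Q)):
            d = math.dist(P[i], Q[j])  # Euclidean distance (assumed metric)
            if i == 0 and j == 0:
                dp[i][j] = d
                continue
            # Best cost over the three allowed predecessor cells.
            best = min(
                dp[i - 1][j] if i > 0 else inf,
                dp[i][j - 1] if j > 0 else inf,
                dp[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            dp[i][j] = combine(d, best)
    return dp[-1][-1]


def discrete_frechet(P: Curve, Q: Curve) -> float:
    """Minimise, over alignments, the largest distance between matched points."""
    return _align(P, Q, max)


def dtw(P: Curve, Q: Curve) -> float:
    """Minimise, over alignments, the sum of distances between matched points."""
    return _align(P, Q, lambda d, best: d + best)


def near_neighbor_bruteforce(
    curves: Sequence[Curve],
    Q: Curve,
    r: float,
    eps: float,
    delta: Callable[[Curve, Curve], float] = discrete_frechet,
) -> Optional[Curve]:
    """Linear-scan baseline for the (1+eps, r) query contract.

    If some curve is within distance r of Q, a curve within (1+eps)*r is
    returned; if none is within (1+eps)*r, the result is None.
    """
    for C in curves:
        if delta(Q, C) <= (1 + eps) * r:
            return C
    return None


if __name__ == "__main__":
    P = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    Q = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
    print(discrete_frechet(P, Q), dtw(P, Q))
```

The scan evaluates one quadratic-time distance computation per stored curve, which is the per-query cost that a dedicated ANNC structure is meant to avoid.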

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.07562/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/1902.07562/full.md

---
Source: https://tomesphere.com/paper/1902.07562