Random Projections and Sampling Algorithms for Clustering of High-Dimensional Polygonal Curves
Stefan Meintrup, Alexander Munteanu, Dennis Rohde

TL;DR
This paper introduces a Johnson-Lindenstrauss projection for high-dimensional polygonal curves to enable efficient clustering using the Fréchet distance, with theoretical error analysis and empirical validation.
Contribution
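It proposes a Johnson-Lindenstrauss projection for polygonal curves, shows that the Fréchet distance cannot be approximated within any factor of less than √2 by probabilistically reducing the dependency on the number of vertices, and provides a CUDA-parallelized version of the Alt and Godau algorithm for computing the Fréchet distance.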
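As a rough illustration of the projection idea, the sketch below maps every vertex of a polygonal curve through one shared random Gaussian matrix. It assumes the projection is applied vertex-wise with a standard Gaussian Johnson-Lindenstrauss matrix; the names `jl_matrix` and `project_curve` and the chosen dimensions are hypothetical and not taken from the paper.

```python
import math
import random


def jl_matrix(source_dim, target_dim, seed=0):
    """Build a target_dim x source_dim matrix with i.i.d. N(0, 1/target_dim) entries.

    Assumption: a plain Gaussian JL matrix; the paper may use a different construction.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(target_dim)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(source_dim)]
            for _ in range(target_dim)]


def project_curve(curve, matrix):
    """Apply the same linear map to every vertex of a polygonal curve.

    A linear map commutes with linear interpolation, so the image of the
    curve is the polygonal curve through the projected vertices.
    """
    return [[sum(m * x for m, x in zip(row, vertex)) for row in matrix]
            for vertex in curve]


if __name__ == "__main__":
    rng = random.Random(1)
    # A hypothetical curve with 5 vertices in 200 dimensions.
    curve = [[rng.uniform(-1.0, 1.0) for _ in range(200)] for _ in range(5)]
    projected = project_curve(curve, jl_matrix(200, 100))
    print(len(projected), len(projected[0]))
```

The number of vertices is unchanged; only the ambient dimension of each vertex shrinks, which is what makes subsequent distance computations cheaper.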
Findings
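The clustering algorithms achieve sublinear dependency on the number of input curves via subsampling
The Fréchet distance cannot be approximated within any factor of less than √2 by probabilistically reducing the dependency on the number of vertices of the curves
The results are evaluated empirically using a CUDA-parallelized version of the Alt and Godau algorithm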
Abstract
We study the k-median clustering problem for high-dimensional polygonal curves with finite but unbounded number of vertices. We tackle the computational issue that arises from the high number of dimensions by defining a Johnson-Lindenstrauss projection for polygonal curves. We analyze the resulting error in terms of the Fréchet distance, which is a tractable and natural dissimilarity measure for curves. Our clustering algorithms achieve sublinear dependency on the number of input curves via subsampling. Also, we show that the Fréchet distance cannot be approximated within any factor of less than √2 by probabilistically reducing the dependency on the number of vertices of the curves. As a consequence, we provide a fast, CUDA-parallelized version of the Alt and Godau algorithm for computing the Fréchet distance and use it to evaluate our results empirically.
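To give a feel for the dissimilarity measure, here is a minimal sketch of the discrete Fréchet distance, computed by dynamic programming over pairs of vertices. This is a deliberately simpler stand-in: it is not the Alt and Godau algorithm for the continuous Fréchet distance that the abstract refers to, and the function name `discrete_frechet` is hypothetical.

```python
import math


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two vertex sequences p and q.

    dp[i][j] holds the smallest achievable maximum vertex-to-vertex distance
    over all monotone couplings of p[0..i] with q[0..j].
    """
    n, m = len(p), len(q)
    dp = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                dp[i][j] = d
            elif i == 0:
                dp[i][j] = max(dp[0][j - 1], d)
            elif j == 0:
                dp[i][j] = max(dp[i - 1][0], d)
            else:
                # Advance on p, on q, or on both; keep the cheapest predecessor.
                best_prev = min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
                dp[i][j] = max(best_prev, d)
    return dp[n - 1][m - 1]


if __name__ == "__main__":
    a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    b = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
    print(discrete_frechet(a, b))  # two parallel segments one unit apart -> 1.0
```

The discrete variant only couples vertices, so it can overestimate the continuous distance when edges are long; it is shown here only because it fits in a few lines.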
Taxonomy
Topics: Time Series Analysis and Forecasting · Image Processing and 3D Reconstruction · Data Management and Algorithms
