ProHD: Projection-Based Hausdorff Distance Approximation
Jiuzhou Fu, Luanzheng Guo, Nathan R. Tallent, Dongfang Zhao

TL;DR
ProHD is a projection-based approximation method that significantly speeds up Hausdorff distance computation on large, high-dimensional datasets while maintaining high accuracy, enabling practical applications in large-scale data analysis.
Contribution
ProHD introduces a novel projection-guided approximation algorithm for Hausdorff distance that guarantees bounded error and outperforms existing sampling methods in speed and accuracy.
Findings
Runs 10-100x faster than exact algorithms on large datasets.
Achieves 5-20x lower error than random sampling approximations.
Effective on datasets with up to two million points in 256 dimensions.
Abstract
The Hausdorff distance (HD) is a robust measure of set dissimilarity, but computing it exactly on large, high-dimensional datasets is prohibitively expensive. We propose ProHD, a projection-guided approximation algorithm that dramatically accelerates HD computation while maintaining high accuracy. ProHD identifies a small subset of candidate "extreme" points by projecting the data onto a few informative directions (such as the centroid axis and top principal components) and computing the HD on this subset. This approach guarantees an underestimate of the true HD with a bounded additive error and typically achieves results within a few percent of the exact value. In extensive experiments on image, physics, and synthetic datasets (up to two million points in 256 dimensions), ProHD runs 10-100× faster than exact algorithms while attaining 5-20× lower error than random…
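The projection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made for illustration: the function names (`approx_hausdorff`, `extreme_indices`, `top_component`) are hypothetical, only one principal direction is used, power iteration stands in for a full PCA, and the candidate points from each set are compared against the full opposite set. That last choice is what makes the lower-bound property easy to see: the directed distance h(A, B) = max over a in A of min over b in B of ||a - b|| can only shrink when the outer maximum is restricted to a subset of A.

```python
import math
import random


def directed_hd(src, dst):
    """Exact directed Hausdorff distance: max over src of the nearest distance in dst."""
    return max(min(math.dist(p, q) for q in dst) for p in src)


def hausdorff(a, b):
    """Exact (brute-force) Hausdorff distance, used here only as a reference."""
    return max(directed_hd(a, b), directed_hd(b, a))


def centroid(points):
    n = len(points)
    return [sum(coord) / n for coord in zip(*points)]


def top_component(points, iters=50, seed=0):
    """Leading principal direction via power iteration (stand-in for a full PCA)."""
    mu = centroid(points)
    centered = [[x - m for x, m in zip(p, mu)] for p in points]
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in mu]
    for _ in range(iters):
        # w = X^T (X v), i.e. the covariance applied to v up to a constant factor
        scores = [sum(x * y for x, y in zip(row, v)) for row in centered]
        w = [sum(s * row[j] for s, row in zip(scores, centered)) for j in range(len(mu))]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def extreme_indices(points, direction, k):
    """Indices of the k smallest and k largest projections onto `direction` (k >= 1)."""
    order = sorted(
        range(len(points)),
        key=lambda i: sum(x * d for x, d in zip(points[i], direction)),
    )
    return set(order[:k]) | set(order[-k:])


def approx_hausdorff(a, b, k=8):
    """Lower-bound estimate: only projection-extreme points are used as query points."""
    ca, cb = centroid(a), centroid(b)
    directions = [
        [y - x for x, y in zip(ca, cb)],   # centroid axis
        top_component(list(a) + list(b)),  # one principal direction of the union
    ]
    estimate = 0.0
    for src, dst in ((a, b), (b, a)):
        candidates = set()
        for d in directions:
            candidates |= extreme_indices(src, d, k)
        # Candidates are a subset of src, so this never exceeds directed_hd(src, dst).
        estimate = max(estimate, directed_hd([src[i] for i in candidates], dst))
    return estimate


if __name__ == "__main__":
    rng = random.Random(1)
    A = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(300)]
    B = [[rng.gauss(0.5, 1) for _ in range(3)] for _ in range(300)]
    print("approx:", approx_hausdorff(A, B), "exact:", hausdorff(A, B))
```

In this sketch the saving comes from the outer loop only: the exact computation evaluates nearest-neighbor distances for every point, while the approximation does so for at most 4k points per set (fewer when the extremes overlap). How close the estimate gets depends on whether the point that realizes the true distance is extreme along one of the chosen directions, which is why additional directions or a larger `k` trade speed for accuracy.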
Taxonomy
Topics: Topological and Geometric Data Analysis · Data Management and Algorithms · Stochastic Gradient Optimization Techniques
