Score: 1

ProHD: Projection-Based Hausdorff Distance Approximation

Published: November 22, 2025 | arXiv ID: 2511.18207v1

By: Jiuzhou Fu , Luanzheng Guo , Nathan R. Tallent and more

BigTech Affiliations: University of Washington

Potential Business Impact:

Finds how different shapes are faster.

Business Areas:
Big Data Data and Analytics

The Hausdorff distance (HD) is a robust measure of set dissimilarity, but computing it exactly on large, high-dimensional datasets is prohibitively expensive. We propose \textbf{ProHD}, a projection-guided approximation algorithm that dramatically accelerates HD computation while maintaining high accuracy. ProHD identifies a small subset of candidate "extreme" points by projecting the data onto a few informative directions (such as the centroid axis and top principal components) and computing the HD on this subset. This approach guarantees an underestimate of the true HD with a bounded additive error and typically achieves results within a few percent of the exact value. In extensive experiments on image, physics, and synthetic datasets (up to two million points in $D=256$), ProHD runs 10--100$\times$ faster than exact algorithms while attaining 5--20$\times$ lower error than random sampling-based approximations. Our method enables practical HD calculations in scenarios like large vector databases and streaming data, where quick and reliable set distance estimation is needed.

Country of Origin
🇺🇸 United States

Page Count
10 pages

Category
Computer Science:
Information Retrieval