Score: 0

Approximate DBSCAN under Differential Privacy

Published: August 12, 2025 | arXiv ID: 2508.08749v2

By: Yuan Qiu, Ke Yi

Potential Business Impact:

Keeps data private while finding groups.

This paper revisits the DBSCAN problem under differential privacy (DP). Existing DP-DBSCAN algorithms aim at publishing the cluster labels of the input points. However, we show that both empirically and theoretically, this approach cannot offer any utility in the published results. We therefore propose an alternative definition of DP-DBSCAN based on the notion of spans. We argue that publishing the spans actually better serves the purposes of visualization and classification of DBSCAN. Then we present a linear-time DP-DBSCAN algorithm achieving the sandwich quality guarantee in any constant dimensions, as well as matching lower bounds on the approximation ratio. A key building block in our algorithm is a linear-time algorithm for constructing a histogram under pure-DP, which is of independent interest. Finally, we conducted experiments on both synthetic and real-world datasets to verify the practical performance of our DP-DBSCAN algorithm.

Country of Origin
🇭🇰 Hong Kong

Page Count
21 pages

Category
Computer Science:
Cryptography and Security