Approximate DBSCAN under Differential Privacy
By: Yuan Qiu, Ke Yi
Potential Business Impact:
Keeps data private while finding groups.
This paper revisits the DBSCAN problem under differential privacy (DP). Existing DP-DBSCAN algorithms aim at publishing the cluster labels of the input points. However, we show that both empirically and theoretically, this approach cannot offer any utility in the published results. We therefore propose an alternative definition of DP-DBSCAN based on the notion of spans. We argue that publishing the spans actually better serves the purposes of visualization and classification of DBSCAN. Then we present a linear-time DP-DBSCAN algorithm achieving the sandwich quality guarantee in any constant dimensions, as well as matching lower bounds on the approximation ratio. A key building block in our algorithm is a linear-time algorithm for constructing a histogram under pure-DP, which is of independent interest. Finally, we conducted experiments on both synthetic and real-world datasets to verify the practical performance of our DP-DBSCAN algorithm.
Similar Papers
Approximate DBSCAN under Differential Privacy
Cryptography and Security
Keeps data private while finding groups.
Local Distance Query with Differential Privacy
Cryptography and Security
Keeps your online map private when finding directions.
Spectral Graph Clustering under Differential Privacy: Balancing Privacy, Accuracy, and Efficiency
Information Theory
Keeps online connections private while still working.