Locally Differentially Private Frequency Estimation via Joint Randomized Response
By: Ye Zheng, Shafizur Rahman Seeam, Yidan Hu and more
Potential Business Impact:
Protects your secrets while still learning from them.
Local Differential Privacy (LDP) is widely recognized as a powerful tool that gives data contributors a strong theoretical privacy guarantee against an untrusted data collector. Under a typical LDP scheme, each data contributor independently and randomly perturbs their data before submitting it to the data collector, which then infers statistics about the original data from the perturbed data it receives. Existing LDP mechanisms share an inherent trade-off between privacy protection and data utility: stronger privacy often comes at the cost of reduced utility. Frequency estimation based on Randomized Response (RR) is a fundamental building block of many LDP mechanisms. In this paper, we propose a novel Joint Randomized Response (JRR) mechanism that uses correlated data perturbations to achieve locally differentially private frequency estimation. JRR divides data contributors into disjoint groups of two and lets the members of each group jointly perturb their binary data. Compared with the classical RR mechanism, this improves frequency-estimation accuracy, while hiding the group membership information preserves the same level of data privacy. Theoretical analysis and detailed simulation studies on both real and synthetic datasets show that JRR achieves the same level of data privacy as the classical RR mechanism while improving frequency-estimation accuracy in the overwhelming majority of cases, by up to two orders of magnitude.
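For readers unfamiliar with the baseline the abstract compares against, the following is a minimal sketch of classical binary RR frequency estimation, not of JRR itself, whose joint perturbation details are not given here. It assumes the standard construction in which each contributor reports their true bit with probability p = e^ε / (1 + e^ε) and the flipped bit otherwise. The function names `rr_perturb` and `rr_estimate` are hypothetical, chosen for illustration only.

```python
import math
import random


def rr_perturb(bit, epsilon, rng):
    """Classical binary randomized response (illustrative helper).

    Keeps the true bit with probability p = e^eps / (1 + e^eps),
    otherwise reports the flipped bit.
    """
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < p else 1 - bit


def rr_estimate(reports, epsilon):
    """Unbiased estimate of the fraction of true 1s from perturbed reports.

    If f is the true frequency, the expected fraction of reported 1s is
    f * p + (1 - f) * (1 - p); solving for f gives the estimator below.
    """
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    epsilon = 1.0  # assumed privacy budget, for illustration only
    # Synthetic data: roughly 30% of contributors hold the value 1.
    true_bits = [1 if rng.random() < 0.3 else 0 for _ in range(100_000)]
    reports = [rr_perturb(b, epsilon, rng) for b in true_bits]
    print("true frequency:     ", sum(true_bits) / len(true_bits))
    print("estimated frequency:", rr_estimate(reports, epsilon))
```

The estimator's variance grows as ε shrinks, because 2p - 1 approaches zero. This is the privacy-utility trade-off the abstract describes, and it is the accuracy that JRR is reported to improve on.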
Similar Papers
Frequency Estimation of Correlated Multi-attribute Data under Local Differential Privacy
Cryptography and Security
Keeps your private info safe, still useful.
Bipartite Randomized Response Mechanism for Local Differential Privacy
Cryptography and Security
Protects private data while still being useful.
Estimating the True Distribution of Data Collected with Randomized Response
Cryptography and Security
Finds true answers from private, mixed-up data.