On the Consistency and Performance of the Iterative Bayesian Update
By: Ehab ElSalamouny, Catuscia Palamidessi
Potential Business Impact:
Lets data collectors estimate population statistics (e.g., age or location distributions) from privacy-protected, noisy user data, without ever seeing the original values.
For many social, scientific, and commercial purposes, it is often important to estimate the distribution of users' data with respect to a sensitive attribute, e.g., their ages, locations, etc. To allow this estimation while protecting the users' privacy, every user applies a local privacy protection mechanism that releases a noisy (sanitized) version of their original datum to the data collector; the original distribution is then estimated using one of the known methods, such as matrix inversion (INV), RAPPOR's estimator, or the iterative Bayesian update (IBU). Unlike for the other estimators, the consistency of IBU, i.e., the convergence of its estimate to the real distribution as the amount of noisy data grows, has been either ignored or incorrectly proved in the literature. In this article, we use the fact that IBU is a maximum likelihood estimator to prove that it is consistent. We also show, through experiments on real datasets, that IBU significantly outperforms the other methods when the users' data are sanitized by the geometric, Laplace, and exponential mechanisms, whereas it is comparable to them in the case of the k-RR and RAPPOR mechanisms. Finally, we consider the case in which the alphabet of the sensitive data is infinite, and we present a technique that allows IBU to operate in this case as well.
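To make the estimation step concrete, here is a minimal Python sketch of the iterative Bayesian update next to the matrix-inversion (INV) baseline. It is not taken from the paper: it assumes a finite alphabet, a known mechanism matrix C with C[x, y] = P(report y | true value x), and the empirical distribution q of the observed noisy reports; the function name ibu, the variable names, and the toy k-RR setup in the usage example are illustrative only.

```python
import numpy as np

def ibu(q, C, iters=1000, tol=1e-12):
    """Iterative Bayesian update: estimate the original distribution p
    from the empirical distribution q of the noisy reports.

    q : shape (m,)   -- empirical frequencies of the m noisy symbols
    C : shape (n, m) -- mechanism matrix, C[x, y] = P(report y | true x), rows sum to 1
    """
    n = C.shape[0]
    p = np.full(n, 1.0 / n)                # start from the uniform distribution
    for _ in range(iters):
        joint = p[:, None] * C             # p(x) * C(x, y), shape (n, m)
        denom = joint.sum(axis=0)          # predicted distribution of the noisy reports
        post = joint / np.where(denom > 0, denom, 1.0)   # posterior P(x | y)
        p_new = post @ q                   # expected frequencies of the true symbols
        if np.abs(p_new - p).sum() < tol:
            return p_new
        p = p_new
    return p

if __name__ == "__main__":
    # Toy k-RR mechanism on an alphabet of size 3 (illustrative parameters only).
    k, e_eps = 3, np.exp(1.0)
    C = np.full((k, k), 1.0 / (e_eps + k - 1))
    np.fill_diagonal(C, e_eps / (e_eps + k - 1))
    true_p = np.array([0.6, 0.3, 0.1])
    rng = np.random.default_rng(0)
    xs = rng.choice(k, size=20_000, p=true_p)
    ys = np.array([rng.choice(k, p=C[x]) for x in xs])
    q = np.bincount(ys, minlength=k) / len(ys)
    print("IBU estimate:", ibu(q, C))
    print("INV estimate:", q @ np.linalg.inv(C))   # matrix-inversion (INV) baseline
```

With these toy parameters both estimators recover a distribution close to true_p; the practical difference appears with mechanisms such as the geometric or Laplace ones, where INV may return entries outside [0, 1] while IBU, being an EM-style update on the probability simplex, always yields a proper probability distribution.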