Tight Differentially Private PCA via Matrix Coherence
By: Tommaso d'Orsi, Gleb Novikov
Potential Business Impact:
Finds patterns in data while keeping individuals' information private.
We revisit the task of computing the span of the top $r$ singular vectors $u_1, \ldots, u_r$ of a matrix under differential privacy. We show that a simple and efficient algorithm -- based on singular value decomposition and standard perturbation mechanisms -- returns a private rank-$r$ approximation whose error depends only on the \emph{rank-$r$ coherence} of $u_1, \ldots, u_r$ and the spectral gap $\sigma_r - \sigma_{r+1}$. This resolves a question posed by Hardt and Roth~\cite{hardt2013beyond}. Our estimator outperforms the state of the art -- significantly so in some regimes. In particular, we show that in the dense setting, it achieves the same guarantees for single-spike PCA in the Wishart model as those attained by optimal non-private algorithms, whereas prior private algorithms failed to do so. In addition, we prove that (rank-$r$) coherence does not increase under Gaussian perturbations. This implies that any estimator based on the Gaussian mechanism -- including ours -- preserves the coherence of the input. We conjecture that similar behavior holds for other structured models, including planted problems in graphs. We also explore applications of coherence to graph problems. In particular, we present a differentially private algorithm for Max-Cut and other constraint satisfaction problems under low coherence assumptions.
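To make the pipeline described in the abstract concrete, below is a minimal sketch, not the paper's calibrated algorithm, of the two ingredients it refers to: the rank-$r$ coherence of the top singular vectors and an SVD-plus-Gaussian-perturbation estimator of the top-$r$ subspace. The function names, the `noise_scale` parameter, and the toy input are illustrative assumptions; an actual differentially private instantiation must calibrate the noise to the data's sensitivity and the $(\varepsilon, \delta)$ budget, which this sketch does not do.

```python
import numpy as np

def rank_r_coherence(U):
    """Rank-r coherence of an orthonormal basis U (n x r):
    (n / r) * max_i ||row_i(U)||_2^2, which lies in [1, n/r] and
    measures how concentrated the subspace is on any one coordinate."""
    n, r = U.shape
    row_norms_sq = np.sum(U ** 2, axis=1)
    return (n / r) * np.max(row_norms_sq)

def perturbed_top_r_subspace(A, r, noise_scale, rng=None):
    """Sketch of an SVD + Gaussian-perturbation estimator:
    add i.i.d. Gaussian noise to A, then return the top-r left
    singular vectors of the perturbed matrix. `noise_scale` is a
    placeholder; a DP guarantee requires calibrating it to the
    sensitivity of A and the privacy parameters."""
    rng = np.random.default_rng() if rng is None else rng
    E = rng.normal(scale=noise_scale, size=A.shape)
    U, _, _ = np.linalg.svd(A + E, full_matrices=False)
    return U[:, :r]

# Toy usage: a low-rank-plus-noise matrix whose top subspace is incoherent.
rng = np.random.default_rng(0)
n, r = 200, 3
B = rng.normal(size=(n, r))
A = B @ B.T / n + 0.01 * rng.normal(size=(n, n))
U_true = np.linalg.svd(A)[0][:, :r]
U_priv = perturbed_top_r_subspace(A, r, noise_scale=0.05, rng=rng)
print("coherence of true top-r subspace:", rank_r_coherence(U_true))
# Subspace error measured by the spectral norm of the projector difference.
print("subspace error:", np.linalg.norm(U_priv @ U_priv.T - U_true @ U_true.T, 2))
```

The usage block illustrates the regime the abstract targets: when the true top-$r$ subspace has small coherence and the spectral gap $\sigma_r - \sigma_{r+1}$ is not too small, the perturbed subspace stays close to the true one.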
Similar Papers
High-Dimensional Asymptotics of Differentially Private PCA
Statistics Theory
Makes private data analysis more accurate.
An Iterative Algorithm for Differentially Private $k$-PCA with Adaptive Noise
Machine Learning (Stat)
Finds important patterns in data privately.
Spectral Perturbation Bounds for Low-Rank Approximation with Applications to Privacy
Machine Learning (CS)
Protects private data while finding important patterns.