An Iterative Algorithm for Differentially Private $k$-PCA with Adaptive Noise
By: Johanna Düngler, Amartya Sanyal
Potential Business Impact:
Finds important patterns in data privately.
Given $n$ i.i.d. random matrices $A_i \in \mathbb{R}^{d \times d}$ that share a common expectation $\Sigma$, the objective of Differentially Private Stochastic PCA is to identify a subspace of dimension $k$ that captures the largest variance directions of $\Sigma$, while preserving differential privacy (DP) of each individual $A_i$. Existing methods either (i) require the sample size $n$ to scale super-linearly with dimension $d$, even under Gaussian assumptions on the $A_i$, or (ii) introduce excessive noise for DP even when the intrinsic randomness within $A_i$ is small. Liu et al. (2022a) addressed these issues for sub-Gaussian data, but only for estimating the top eigenvector ($k=1$) via their algorithm DP-PCA. We propose the first algorithm capable of estimating the top $k$ eigenvectors for arbitrary $k \leq d$, whilst overcoming both limitations above. For $k=1$, our algorithm matches the utility guarantees of DP-PCA, achieving near-optimal statistical error even when $n = \tilde{O}(d)$. We further provide a lower bound for general $k > 1$ that matches our upper bound up to a factor of $k$, and we experimentally demonstrate the advantages of our algorithm over comparable baselines.
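For context, the sketch below shows the standard Gaussian-mechanism baseline that this line of work improves on: perturb the empirical mean of the $A_i$ with noise calibrated to worst-case sensitivity, then take the top-$k$ eigenvectors of the result (eigendecomposition is post-processing, so privacy is preserved). This is not the paper's iterative adaptive-noise algorithm; the function name `dp_kpca_baseline`, the norm bound $\|A_i\|_F \leq 1$, and the resulting $\ell_2$-sensitivity of $2/n$ are illustrative assumptions.

```python
import numpy as np

def dp_kpca_baseline(As, k, epsilon, delta, seed=None):
    """(epsilon, delta)-DP estimate of the top-k eigenspace of Sigma = E[A_i].

    Gaussian-mechanism baseline (illustrative, not the paper's method):
    assumes ||A_i||_F <= 1, so replacing one sample changes the empirical
    mean by at most 2/n in Frobenius norm.
    """
    rng = np.random.default_rng(seed)
    n, d, _ = As.shape
    mean = As.mean(axis=0)                                  # empirical estimate of Sigma
    # Classical Gaussian-mechanism calibration: sigma = Delta * sqrt(2 ln(1.25/delta)) / epsilon
    sigma = (2.0 / n) * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    noisy = mean + rng.normal(scale=sigma, size=(d, d))     # worst-case noise, added regardless
    sym = (noisy + noisy.T) / 2                             # symmetrise (post-processing, DP-safe)
    eigvals, eigvecs = np.linalg.eigh(sym)                  # ascending eigenvalues
    return eigvecs[:, np.argsort(eigvals)[::-1][:k]]        # (d, k) orthonormal top-k basis

# Usage on synthetic rank-one samples A_i = x_i x_i^T (hypothetical data):
rng = np.random.default_rng(0)
x = rng.normal(size=(5000, 20))
As = np.einsum("ni,nj->nij", x, x)
As /= np.linalg.norm(As, axis=(1, 2), keepdims=True)        # enforce ||A_i||_F <= 1
U = dp_kpca_baseline(As, k=3, epsilon=1.0, delta=1e-5)
```

The limitation the abstract targets is visible here: the noise scale depends only on the worst-case sensitivity $2/n$, so the baseline adds the same noise even when the intrinsic randomness within the $A_i$ is small, which the paper's adaptive-noise approach avoids.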
Similar Papers
High-Dimensional Asymptotics of Differentially Private PCA
Statistics Theory
Makes private data analysis more accurate.
Tight Differentially Private PCA via Matrix Coherence
Machine Learning (CS)
Keeps private data safe when finding patterns.
On Purely Private Covariance Estimation
Machine Learning (CS)
Keeps data private while still learning from it.