Leveraging Randomness in Model and Data Partitioning for Privacy Amplification
By: Andy Dong, Wei-Ning Chen, Ayfer Ozgur
Potential Business Impact:
Keeps your private data safe when training machine-learning models.
We study how inherent randomness in the training process -- where each sample (or client in federated learning) contributes only to a randomly selected portion of training -- can be leveraged for privacy amplification. This includes (1) data partitioning, where a sample participates in only a subset of training iterations, and (2) model partitioning, where a sample updates only a subset of the model parameters. We apply our framework to model parallelism in federated learning, where each client updates a randomly selected subnetwork to reduce memory and computational overhead, and show that existing methods, e.g., model splitting or dropout, provide a significant privacy amplification gain not captured by previous privacy analysis techniques. Additionally, we introduce Balanced Iteration Subsampling, a new data partitioning method where each sample (or client) participates in a fixed number of training iterations. We show that this method yields stronger privacy amplification than Poisson (i.i.d.) sampling of data (or clients). Our results demonstrate that randomness in the training process, which is structured rather than i.i.d. and interacts with data in complex ways, can be systematically leveraged for significant privacy amplification.
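To make the sampling mechanisms concrete, here is a minimal Python sketch (not the authors' code; the function names, client/round counts, and rates are illustrative assumptions) contrasting Poisson (i.i.d.) client subsampling with Balanced Iteration Subsampling, where each client joins exactly k of the T rounds, plus a dropout-style random parameter mask for model partitioning.

```python
import numpy as np

rng = np.random.default_rng(0)

def poisson_subsampling(num_clients, num_rounds, q):
    """Poisson (i.i.d.) sampling: each client joins each round independently
    with probability q, so its participation count is Binomial(num_rounds, q)."""
    return rng.random((num_rounds, num_clients)) < q

def balanced_iteration_subsampling(num_clients, num_rounds, k):
    """Balanced Iteration Subsampling (as described in the abstract): each
    client participates in exactly k of the num_rounds iterations, chosen
    uniformly at random, so its participation count is fixed rather than random."""
    mask = np.zeros((num_rounds, num_clients), dtype=bool)
    for c in range(num_clients):
        rounds = rng.choice(num_rounds, size=k, replace=False)
        mask[rounds, c] = True
    return mask

def random_subnetwork_mask(num_params, keep_prob):
    """Model partitioning: a client updates only a randomly selected subset
    of the model parameters (dropout-style subnetwork selection)."""
    return rng.random(num_params) < keep_prob

# Illustrative run: 100 clients, 50 rounds, matched expected participation (q = k / T).
poisson_mask = poisson_subsampling(100, 50, q=0.1)
balanced_mask = balanced_iteration_subsampling(100, 50, k=5)
print(poisson_mask.sum(axis=0)[:5])   # varies around 5 participations per client
print(balanced_mask.sum(axis=0)[:5])  # exactly 5 participations per client
```

With q = k / T the two schemes match in expected participation; the difference is that Balanced Iteration Subsampling fixes each client's participation count, the kind of structured (non-i.i.d.) randomness the abstract argues yields stronger privacy amplification than Poisson sampling.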
Similar Papers
Differential Privacy Personalized Federated Learning Based on Dynamically Sparsified Client Updates
Machine Learning (CS)
Keeps your private data safe during AI training.
FedRand: Enhancing Privacy in Federated Learning with Randomized LoRA Subparameter Updates
Machine Learning (CS)
Keeps private data safe when training AI models.
Decomposition-Based Optimal Bounds for Privacy Amplification via Shuffling
Cryptography and Security
Protects your data better when shared online.