Byzantine Outside, Curious Inside: Reconstructing Data Through Malicious Updates
By: Kai Yue, Richeng Jin, Chau-Wai Wong, and more
Potential Business Impact:
Makes private data easier to steal.
Federated learning (FL) enables decentralized machine learning without sharing raw data, allowing multiple clients to collaboratively learn a global model. However, studies reveal that privacy leakage is possible under commonly adopted FL protocols. In particular, a server with access to client gradients can synthesize data resembling the clients' training data. In this paper, we introduce a novel threat model in FL, named the maliciously curious client, in which a client manipulates its own gradients with the goal of inferring private data from peers. This attacker uniquely exploits the strength of a Byzantine adversary, traditionally aimed at undermining model robustness, and repurposes it to facilitate data reconstruction attacks. We begin by formally defining this novel client-side threat model and providing a theoretical analysis that demonstrates its ability to achieve significant reconstruction success during FL training. To show its practical impact, we further develop a reconstruction algorithm that combines gradient inversion with malicious update strategies. Our analysis and experimental results reveal a critical blind spot in FL defenses: both server-side robust aggregation and client-side privacy mechanisms may fail against our proposed attack. Surprisingly, standard server- and client-side defenses designed to enhance robustness or privacy may unintentionally amplify data leakage. Compared to the baseline approach, a misapplied defense may instead improve the quality of reconstructed images by 10-15%.
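To make the gradient-inversion component of the abstract concrete, below is a minimal sketch in the spirit of DLG-style gradient inversion (optimizing a dummy input so its gradient matches an observed one). This is not the paper's algorithm and omits the malicious-update component; the toy model, data shapes, learning rate, and step count are illustrative assumptions.

```python
# Sketch: reconstruct a training example from its gradient by gradient
# matching. An attacker who observes the gradient a victim client computed
# can optimize a dummy (input, label) pair until its gradient matches.
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # toy model
loss_fn = nn.CrossEntropyLoss()

# Simulate the victim's gradient on one private example (assumed shapes).
x_true = torch.rand(1, 1, 28, 28)
y_true = torch.tensor([3])
target_grads = torch.autograd.grad(
    loss_fn(model(x_true), y_true), model.parameters()
)
target_grads = [g.detach() for g in target_grads]

# Attacker: jointly optimize a dummy input and a soft dummy label so that
# the gradient they induce reproduces the observed one.
x_dummy = torch.rand(1, 1, 28, 28, requires_grad=True)
y_dummy = torch.randn(1, 10, requires_grad=True)
opt = torch.optim.Adam([x_dummy, y_dummy], lr=0.1)

for step in range(300):
    opt.zero_grad()
    pred = model(x_dummy)
    # Soft-label cross-entropy so the label can be optimized too.
    dummy_loss = torch.mean(
        torch.sum(-torch.softmax(y_dummy, -1) * torch.log_softmax(pred, -1),
                  dim=-1)
    )
    dummy_grads = torch.autograd.grad(
        dummy_loss, model.parameters(), create_graph=True
    )
    # Gradient-matching objective: L2 distance between gradient tensors.
    grad_diff = sum(((dg - tg) ** 2).sum()
                    for dg, tg in zip(dummy_grads, target_grads))
    grad_diff.backward()
    opt.step()

print(f"final gradient-matching loss: {grad_diff.item():.6f}")
```

In the threat model described above, the maliciously curious client would additionally craft its own uploaded updates to steer the aggregation so that peers' gradients become easier to invert; that server-side interaction is beyond this standalone sketch.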
Similar Papers
Byzantine-Robust Federated Learning Using Generative Adversarial Networks
Cryptography and Security
Keeps AI learning safe from bad data.
Toward Malicious Clients Detection in Federated Learning
Cryptography and Security
Finds bad guys in computer learning teams.
Boosting Gradient Leakage Attacks: Data Reconstruction in Realistic FL Settings
Machine Learning (CS)
Steals private data from shared computer learning.