Federated Learning in the Wild: A Comparative Study for Cybersecurity under Non-IID and Unbalanced Settings
By: Roberto Doriguzzi-Corin, Petr Sabel, Silvio Cretti and more
Potential Business Impact:
Helps computers find online attacks without sharing private data.
Machine Learning (ML) techniques have shown strong potential for network traffic analysis; however, their effectiveness depends on access to representative, up-to-date datasets, which is limited in cybersecurity by privacy and data-sharing restrictions. To address this challenge, Federated Learning (FL) has recently emerged as a paradigm that enables collaborative training of ML models across multiple clients while ensuring that sensitive data remains local. Nevertheless, Federated Averaging (FedAvg), the canonical FL algorithm, has been shown to converge poorly in heterogeneous environments where data are not independent and identically distributed (non-i.i.d.) and client datasets are unbalanced, conditions frequently observed in cybersecurity contexts. To overcome these limitations, several alternative FL strategies have been developed, yet their applicability to network intrusion detection remains insufficiently explored. This study systematically reviews and evaluates a range of FL methods in the context of intrusion detection for DDoS attacks. Using a dataset of network attacks within a Kubernetes-based testbed, we assess convergence efficiency, computational overhead, bandwidth consumption, and model accuracy. To the best of our knowledge, this is the first comparative analysis of FL algorithms for intrusion detection under realistic non-i.i.d. and unbalanced settings, providing new insights for the design of robust, privacy-preserving network security solutions.
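For readers unfamiliar with FedAvg, the aggregation step the abstract refers to can be sketched as below. This is a minimal illustration of the standard sample-count-weighted average, not the paper's implementation; the function name `fedavg_aggregate`, the flat-list representation of model parameters, and the example client sizes are all assumptions made for illustration.

```python
from typing import List, Sequence


def fedavg_aggregate(
    client_params: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Average flattened client parameters, weighted by local sample count.

    Hypothetical helper: each client's model is represented as a flat
    list of floats, which is an assumption made to keep the sketch small.
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(client_params[0])
    aggregated = [0.0] * dim
    for params, n_samples in zip(client_params, client_sizes):
        if len(params) != dim:
            raise ValueError("all clients must share the same parameter shape")
        share = n_samples / total  # larger clients contribute more
        for i, value in enumerate(params):
            aggregated[i] += share * value
    return aggregated


# Two hypothetical clients with unbalanced datasets (900 vs 100 samples).
global_params = fedavg_aggregate([[1.0, 0.0], [0.0, 1.0]], [900, 100])
```

With unbalanced clients such as these, the global model is pulled toward the larger client's parameters, which is one way to see why unbalanced, non-i.i.d. data can hamper FedAvg's convergence.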
Similar Papers
A Robust Federated Learning Approach for Combating Attacks Against IoT Systems Under non-IID Challenges
Machine Learning (CS)
Helps computers learn to spot internet dangers.
Anomaly Detection in Electric Vehicle Charging Stations Using Federated Learning
Machine Learning (CS)
Secures electric car chargers without spying.
A Comparative Benchmark of Federated Learning Strategies for Mortality Prediction on Heterogeneous and Imbalanced Clinical Data
Machine Learning (CS)
Helps doctors predict patient deaths better, safely.