Adaptive Network Security Policies via Belief Aggregation and Rollout
By: Kim Hammar, Yuchao Li, Tansu Alpcan, and more
Potential Business Impact:
Automates updates to computer network defenses, making them faster and more reliable.
Evolving security vulnerabilities and shifting operational conditions require frequent updates to network security policies. These updates include adjustments to incident response procedures and modifications to access controls, among others. Reinforcement learning methods have been proposed for automating such policy adaptations, but most of the methods in the research literature lack performance guarantees and adapt slowly to changes. In this paper, we address these limitations and present a method for computing security policies that is scalable, offers theoretical guarantees, and adapts quickly to changes. It assumes a model or simulator of the system and comprises three components: belief estimation through particle filtering, offline policy computation through aggregation, and online policy adaptation through rollout. Central to our method is a new feature-based aggregation technique, which improves scalability and flexibility. We analyze the approximation error of aggregation and show that rollout efficiently adapts policies to changes under certain conditions. Simulations and testbed results demonstrate that our method outperforms state-of-the-art methods on several benchmarks, including CAGE-2.
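To make the pipeline concrete, here is a minimal sketch of two of the components the abstract names: belief estimation via a particle filter and online action selection via rollout over that belief. This is an illustrative toy, not the authors' implementation; the two-state host model, the alert probabilities, and all function names are assumptions chosen for brevity.

```python
import random

# Toy two-state model of a single host (hypothetical, for illustration only).
STATES = ["healthy", "compromised"]
ACTIONS = ["monitor", "isolate"]

def transition(state, action):
    # Assumed dynamics: isolating a compromised host restores it;
    # otherwise a healthy host becomes compromised with small probability.
    if state == "compromised" and action == "isolate":
        return "healthy"
    if state == "healthy" and random.random() < 0.1:
        return "compromised"
    return state

def obs_likelihood(obs, state):
    # Assumed sensor model: compromised hosts raise alerts 80% of the time,
    # healthy hosts 10% of the time (false positives).
    p_alert = 0.8 if state == "compromised" else 0.1
    return p_alert if obs == "alert" else 1.0 - p_alert

def reward(state, action):
    # Isolation has a small cost; an unhandled compromise is worse.
    r = -5.0 if state == "compromised" else 0.0
    return r - (1.0 if action == "isolate" else 0.0)

def particle_filter_update(particles, action, obs):
    # Belief estimation: propagate each particle through the dynamics,
    # then resample in proportion to the observation likelihood.
    propagated = [transition(s, action) for s in particles]
    weights = [obs_likelihood(obs, s) for s in propagated]
    total = sum(weights) or 1.0
    return random.choices(propagated, weights=[w / total for w in weights],
                          k=len(particles))

def rollout_action(particles, horizon=5, n_sims=30):
    # Rollout: simulate each candidate first action from states sampled
    # from the belief, then follow a simple base policy, and pick the
    # action with the best average simulated return.
    best_action, best_value = None, float("-inf")
    for first_action in ACTIONS:
        total = 0.0
        for _ in range(n_sims):
            s = random.choice(particles)  # sample a state from the belief
            act, ret = first_action, 0.0
            for _ in range(horizon):
                ret += reward(s, act)
                s = transition(s, act)
                # Base policy (a deliberately crude heuristic): isolate
                # whenever the simulated state is compromised.
                act = "isolate" if s == "compromised" else "monitor"
            total += ret
        if total / n_sims > best_value:
            best_action, best_value = first_action, total / n_sims
    return best_action
```

In the paper's terms, the offline aggregation step would replace the crude base policy here with one computed from an aggregated (feature-based) model, and rollout then improves on that base policy online as conditions change.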
Similar Papers
Adaptive Security Policy Management in Cloud Environments Using Reinforcement Learning
Cryptography and Security
Makes cloud security smarter and faster.
Efficient Adaptation of Reinforcement Learning Agents to Sudden Environmental Change
Machine Learning (CS)
Helps robots learn new tricks without forgetting old ones.
Learning on the Fly: Rapid Policy Adaptation via Differentiable Simulation
Robotics
Robots learn to fix mistakes instantly in real world.