AI for Attack Identification, Response and Recovery (AIR²)
This WASP NEST project is a collaboration between Linköping University, Umeå University and KTH.
Within this project, we will build systems with high autonomy and high resource efficiency that operate in hostile environments. The project aims to understand the fundamental benefits and limitations of closed-loop automation when adopting ML for resource efficiency. Specifically, we will focus on building software-intensive, high-performing communication infrastructures with three main objectives:
prevention of cyberthreats by anticipating and mitigating them,
accurate detection of ongoing attacks with well-understood basic components and AI methods, and
(partially) autonomous and explainable reaction to attacks in the presence of resource trade-offs and complex potential changes over time (concept drift).
The project is split into five work packages (WPs) and our lab leads WP2:
Learning interpretable models for identifying attacks and countermeasures
Recent advancements in model-based planning have turned AI planning systems into powerful domain-independent sequential decision makers (Hoffmann and Nebel 2001; Richter and Westphal 2010; Seipp, Keller, and Helmert 2020). Nowadays, the primary bottleneck lies in accurate modeling rather than planning itself. Many current approaches for learning symbolic planning models from execution traces fall short because they assume full observability and no noise, thus limiting their scalability in real-world applications such as large communication systems (Arora et al. 2018; Lamanna et al. 2021). While reinforcement learning algorithms can learn policies without explicit modeling (Sutton and Barto 1998), they often face issues with sparse rewards, changes in their environment, and a lack of interpretability in the resulting policies.
Within WP2, we aim to merge the strengths of both approaches: learning symbolic planning models from data while employing RL to explore, but not control, an unknown state space. We will train RL agents to prioritize exploring less-visited regions of the state space and use classical symbolic modeling to assemble the gathered information into a coherent planning model. To scale our models to large communication systems, we will also address partial observability and noisy sensor data.
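This combination can be illustrated with a toy sketch (all domain, fact, and action names below are invented for illustration and are not part of the project): a count-based exploration policy repeatedly tries the least-attempted action in each state, and the collected transitions are then assembled into a STRIPS-style model, with preconditions learned by intersection and effects by union over observed deltas.

```python
from collections import defaultdict
import random

# Hypothetical toy domain: states are sets of facts; each action has hidden
# STRIPS semantics (preconditions, add effects, delete effects) that the
# learner must recover from observed transitions.
HIDDEN = {
    "open-port":  (frozenset({"fw-up"}), frozenset({"port-open"}), frozenset()),
    "start-svc":  (frozenset({"port-open"}), frozenset({"svc-up"}), frozenset()),
    "close-port": (frozenset({"port-open"}), frozenset(), frozenset({"port-open"})),
}

def apply_action(state, action):
    pre, add, delete = HIDDEN[action]
    if not pre <= state:
        return None  # action not applicable in this state
    return (state - delete) | add

def explore(steps=500, seed=0):
    """Count-based exploration: always try the least-attempted action."""
    rng = random.Random(seed)
    counts = defaultdict(int)   # attempts per (state, action) pair
    traces = defaultdict(list)  # successful (state, successor) pairs per action
    state = init = frozenset({"fw-up"})
    for _ in range(steps):
        action = min(HIDDEN, key=lambda a: (counts[(state, a)], rng.random()))
        counts[(state, action)] += 1
        successor = apply_action(state, action)
        if successor is not None:
            traces[action].append((state, successor))
            state = successor
        if rng.random() < 0.1:  # occasional reset to revisit early states
            state = init
    return traces

def learn_model(traces):
    """Assemble a symbolic model: preconditions by intersection over the
    states where an action succeeded, effects by union over observed deltas.
    The learned preconditions over-approximate: facts that are incidentally
    always true (like 'fw-up' here) are kept until refuted by new data."""
    model = {}
    for action, pairs in traces.items():
        pre = frozenset.intersection(*(s for s, _ in pairs))
        add = frozenset.union(*(n - s for s, n in pairs))
        delete = frozenset.union(*(s - n for s, n in pairs))
        model[action] = (pre, add, delete)
    return model

model = learn_model(explore())
```

The exploration bonus here is the crudest possible one (raw attempt counts); the project's RL agents would replace it with a learned exploration policy, and the intersection/union learner would be replaced by methods that tolerate partial observability and noise.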
Our research will contribute to a semi-automatic network hardening loop that uses learned models to plan responses to cybersecurity attacks. Concretely, we will use off-the-shelf state-of-the-art planners to identify network attacks and modify network configurations to prevent future breaches. By continuously updating the learned models, we will iteratively identify and counter new attacks. In addition, we will use Stackelberg planning (Speicher et al. 2018) to discover attacks and countermeasures simultaneously.
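A minimal sketch of such a hardening loop, assuming a toy learned attack model (all fact and action names are invented for illustration), with breadth-first search standing in for an off-the-shelf planner. The greedy block-and-replan loop below is only a crude stand-in for the attacker/defender interplay that Stackelberg planning handles properly:

```python
from collections import deque

# Hypothetical learned model: action -> (preconditions, add effects, delete effects).
MODEL = {
    "scan":       (frozenset({"net-access"}), frozenset({"host-known"}), frozenset()),
    "exploit":    (frozenset({"host-known", "vuln"}), frozenset({"foothold"}), frozenset()),
    "exfiltrate": (frozenset({"foothold"}), frozenset({"data-stolen"}), frozenset()),
}
INIT = frozenset({"net-access", "vuln"})
ATTACK_GOAL = frozenset({"data-stolen"})

def plan(model, init, goal):
    """Breadth-first search over the learned model (illustrative planner)."""
    frontier, seen = deque([(init, [])]), {init}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:
            return steps
        for action, (pre, add, delete) in model.items():
            if pre <= state:
                successor = (state - delete) | add
                if successor not in seen:
                    seen.add(successor)
                    frontier.append((successor, steps + [action]))
    return None  # goal unreachable: the attack is blocked

def harden(model, init, attack_goal):
    """Greedy hardening loop: while an attack plan exists, block the
    capability behind its first step and re-plan."""
    blocked, current = [], dict(model)
    while (attack := plan(current, init, attack_goal)) is not None:
        blocked.append(attack[0])
        del current[attack[0]]  # countermeasure: disable that action
    return blocked

print(harden(MODEL, INIT, ATTACK_GOAL))  # prints ['scan']
```

In the project, the hand-written model above would be replaced by a continuously updated learned model, and deleting an action would correspond to an actual network configuration change (e.g., a firewall rule).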
Team for WP2
PI: Jendrik Seipp
Core team: One PhD student, Prof. Rolf Stadler (KTH)
Funding: This project is supported by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation.
Arora, Ankuj, Humbert Fiorino, Damien Pellier, Marc Métivier, and Sylvie Pesty. 2018. “A Review of Learning Planning Action Models.” Knowledge Engineering Review 33 (e20): 1–25.
Hoffmann, Jörg, and Bernhard Nebel. 2001. “The FF Planning System: Fast Plan Generation Through Heuristic Search.” JAIR 14: 253–302.
Lamanna, Leonardo, Alessandro Saetti, Luciano Serafini, Alfonso E. Gerevini, and Paolo Traverso. 2021. “Online Learning of Action Models for PDDL Planning.” In Proc. IJCAI 2021, 4112–8.
Richter, Silvia, and Matthias Westphal. 2010. “The LAMA Planner: Guiding Cost-Based Anytime Planning with Landmarks.” JAIR 39: 127–77.
Seipp, Jendrik, Thomas Keller, and Malte Helmert. 2020. “Saturated Cost Partitioning for Optimal Classical Planning.” JAIR 67: 129–67.
Speicher, Patrick, Marcel Steinmetz, Michael Backes, Jörg Hoffmann, and Robert Künnemann. 2018. “Stackelberg Planning: Towards Effective Leader-Follower State Space Search.” In Proc. AAAI 2018, 6286–93.
Sutton, Richard S., and Andrew G. Barto. 1998. Reinforcement Learning: An Introduction. Cambridge, MA, USA: MIT Press.