
AI for Attack Identification, Response and Recovery (AIR²)

We are currently looking for a PhD student to work on this project.

This WASP NEST project is a collaboration between Linköping University, Umeå University and KTH.

Within this project, we will build systems that combine high autonomy with high resource efficiency and that operate in hostile environments. The project aims to understand the fundamental benefits and limitations of closed-loop automation when adopting ML for resource efficiency. Specifically, we will focus on building software-intensive, high-performing communication infrastructures with the following three main objectives:

prevention of cyberthreats by anticipating and mitigating them,
accurate detection of ongoing attacks with well-understood basic components and AI methods, and
(partially) autonomous and explainable reaction to attacks in the presence of resource trade-offs and complex potential changes over time (concept drift).

The project is split into five work packages (WPs) and our lab leads WP2:

Learning interpretable models for identifying attacks and countermeasures

Recent advancements in model-based planning have turned AI planning systems into powerful domain-independent sequential decision makers (Hoffmann and Nebel 2001; Richter and Westphal 2010; Seipp, Keller, and Helmert 2020). Nowadays, the primary bottleneck lies in accurate modeling rather than planning itself. Many current approaches for learning symbolic planning models from execution traces fall short because they assume full observability and no noise, thus limiting their scalability in real-world applications such as large communication systems (Arora et al. 2018; Lamanna et al. 2021). While reinforcement learning algorithms can learn policies without explicit modeling (Sutton and Barto 1998), they often face issues with sparse rewards, changes in their environment, and a lack of interpretability in the resulting policies.

Within WP2 we aim to merge the strengths of both approaches: learning symbolic planning models from data while employing RL to explore, but not control, an unknown state space. We will train RL agents to prioritize the exploration of lesser-known state space areas and use classical symbolic modeling to piece together the gathered information into a coherent planning model. To expand our models to large communication systems, we will also address issues related to partial observability and noisy sensor data.
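To make the division of labor concrete, here is a minimal Python sketch and not the project's actual method. It assumes full observability and noise-free traces, which is exactly the restriction WP2 aims to lift. A simple count-based rule stands in for a trained RL agent, and the toy domain (the facts `port_open` and `compromised`, the actions `exploit` and `close_port`, and the simulator `toy_step`) is hypothetical. Exploration only gathers transitions; a separate symbolic step turns them into STRIPS-style preconditions and effects.

```python
from collections import defaultdict


def learn_action_model(transitions):
    """Learn STRIPS-style preconditions and effects from observed transitions.

    Each transition is a (state, action, next_state) triple whose states are
    frozensets of true facts. Assumes full observability and no noise.
    """
    pre, add, delete = {}, defaultdict(set), defaultdict(set)
    for state, action, next_state in transitions:
        # Preconditions: facts that held every time the action changed the state.
        pre[action] = set(state) if action not in pre else pre[action] & state
        add[action] |= next_state - state
        delete[action] |= state - next_state
    return {a: {"pre": pre[a], "add": add[a], "del": delete[a]} for a in pre}


def pick_least_tried(state, actions, counts):
    """Count-based exploration: prefer the action tried least often in this state."""
    return min(actions, key=lambda a: counts[(state, a)])


def explore(initial_state, actions, step, episodes=2, steps_per_episode=3):
    """Collect state-changing transitions; the explorer never plans or controls."""
    counts = defaultdict(int)  # visit counts persist across episodes
    transitions = []
    for _ in range(episodes):
        state = initial_state
        for _ in range(steps_per_episode):
            action = pick_least_tried(state, actions, counts)
            counts[(state, action)] += 1
            next_state = step(state, action)
            # Assumption: an action that changes nothing counts as inapplicable.
            if next_state != state:
                transitions.append((state, action, next_state))
            state = next_state
    return transitions


def toy_step(state, action):
    """Hypothetical simulator: both actions require an open port."""
    if action == "close_port" and "port_open" in state:
        return state - {"port_open"}
    if action == "exploit" and "port_open" in state:
        return state | {"compromised"}
    return state


traces = explore(frozenset({"port_open"}), ["exploit", "close_port"], toy_step)
model = learn_action_model(traces)
```

In this toy run, the second episode is what removes a spurious precondition: after the first episode alone, `close_port` has only been observed in a state that also contained `compromised`. Directing exploration toward rarely tried state-action pairs is one way to shrink such over-specific preconditions.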

Our research will contribute to the development of a semi-automatic network hardening loop that uses learned models to plan and respond to cybersecurity attacks. Specifically, we will use off-the-shelf state-of-the-art planners to identify network attacks and then modify network configurations to prevent future breaches. By continuously updating the learned models, we will iteratively identify and counter new attacks. In addition, we will use Stackelberg planning (Speicher et al. 2018) to simultaneously discover attacks and countermeasures.
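The shape of such a hardening loop can be illustrated with a small Python sketch under strong simplifying assumptions. Breadth-first search over a hypothetical reachability graph stands in for an off-the-shelf planner, the only countermeasure is blocking the last hop of a found attack path, and the node names (`internet`, `web`, `vpn`, `db`) are invented for the example. It does not implement Stackelberg planning, which reasons about attacker and defender jointly.

```python
from collections import deque


def find_attack_path(edges, source, target):
    """Breadth-first search for a shortest attack path (stand-in for a planner)."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        # Sort successors so the search order is deterministic.
        for succ in sorted(edges.get(node, ())):
            if succ not in parents:
                parents[succ] = node
                queue.append(succ)
    return None


def harden(edges, source, target):
    """Find an attack, block its last hop, and repeat until no attack remains.

    Assumes source != target. Works on a copy, so the input graph is unchanged.
    """
    edges = {node: set(succs) for node, succs in edges.items()}
    blocked = []
    while (path := find_attack_path(edges, source, target)) is not None:
        hop = (path[-2], path[-1])
        edges[hop[0]].discard(hop[1])  # countermeasure: remove this connection
        blocked.append(hop)
    return blocked


# Hypothetical network: which host can reach which other host.
network = {"internet": {"web", "vpn"}, "web": {"db"}, "vpn": {"db"}}
blocked = harden(network, "internet", "db")
```

Here the loop blocks the `vpn` to `db` connection first and the `web` to `db` connection second, after which no attack path remains. A realistic loop would also weigh the cost of each countermeasure, since cutting every connection to a service defeats its purpose.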

Team for WP2

PI: Jendrik Seipp

Core team: One PhD student, Prof. Rolf Stadler (KTH)

Funding: This project is supported by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation.
