Rising cyber threats and workforce gaps
29/04/2025
As military systems become more complex and interconnected, timely information-sharing is critical to mission success. At the same time, cyber threats are growing in sophistication. Combined with an industry shortage of human cyber operators, these challenges point to the need for autonomous systems in cyber defence. However, deploying AI in cyber defence requires more than innovation: it demands assurance.
To adopt AI-based autonomous agents for cyber defence, a robust test and evaluation (T&E) process is essential. Such a process must ensure these agents work as expected, meet user requirements and are robust, ethical, safe and secure.
As part of its four-year Autonomous Resilient Cyber Defence (ARCD) programme, QinetiQ has developed Dstl’s blueprint for T&E and demonstrated its application to interactive data-driven cyber-defence agents trained by a third party.
The T&E process
The T&E blueprint for autonomous cyber defence of military platforms consists of six key phases. The process is:
- Iterative: built around cycles of evaluation and refinement
- Evidence-based: focused on reducing uncertainty
- Risk-focused: based on identifying, estimating, reporting and updating key risks
At its core, the process builds evidence of whether an autonomous agent is safe and fit for purpose.
The Demonstration Agent project
The Demonstration Agent project was delivered in collaboration with:
- Applied Data Science Partners (ADSP), BMT and Frazer-Nash Consultancy, which developed and trained the agents
- QinetiQ, which acted as the independent agent evaluator
The agents were trained and evaluated in the ARCD simulation environment, PrimAITE, and the best-performing agent was then evaluated in a more realistic environment, PalisAIDE.
Both environments were based on a realistic military communication network, including:
- Two types of cyber-realistic adversarial agent
- A range of network users
- Four Mission Objectives covering military and network-specific goals for each type of network user
Evaluation approach
The evaluation used Mission Success Criteria (MSC), which measure how well the defensive agent enabled the mission objectives to be met. This made results meaningful for military and network stakeholders.
Key comparisons included:
- System performance with no attacks or defence (baseline)
- Impact of cyber-attacks by adversarial agents
- The defensive agent’s ability to recover the network after an attack
- Any negative effects the defensive agent had on mission success in the absence of a cyber-attack
- How well the agent generalised across different scenarios
Using multiple MSC allowed evaluators to identify the relative strengths and weaknesses of each agent and target improvements accordingly.
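As a minimal sketch of how such comparisons might be computed, the Python below derives three quantities per criterion: the impact of an attack, the fraction of lost performance the defensive agent recovered, and any overhead the agent imposed when no attack took place. The criterion names, the 0 to 1 scoring scale, the condition labels and every number are assumptions made for illustration; they are not ARCD results, and this is not the evaluators' actual tooling.

```python
# Hypothetical MSC scores in [0, 1] for each evaluation condition.
# All names and values are invented for illustration only.
SCORES = {
    "baseline":        {"msc_a": 1.00, "msc_b": 1.00},  # no attack, no defence
    "attack_only":     {"msc_a": 0.40, "msc_b": 0.70},  # adversary, no defence
    "attack_defended": {"msc_a": 0.85, "msc_b": 0.75},  # adversary and defence
    "defence_only":    {"msc_a": 0.95, "msc_b": 1.00},  # defence, no attack
}


def compare_conditions(scores):
    """Return per-criterion comparison metrics across the four conditions."""
    results = {}
    for msc, base in scores["baseline"].items():
        attacked = scores["attack_only"][msc]
        defended = scores["attack_defended"][msc]
        idle = scores["defence_only"][msc]
        lost = base - attacked  # performance lost to the attack
        results[msc] = {
            "attack_impact": lost,
            # Share of the lost performance that the agent won back;
            # undefined (None) when the attack caused no loss.
            "recovered_fraction": (defended - attacked) / lost if lost > 0 else None,
            # Cost of running the agent when there is nothing to defend against.
            "defence_overhead": base - idle,
        }
    return results


if __name__ == "__main__":
    for msc, metrics in compare_conditions(SCORES).items():
        print(msc, metrics)
```

Keeping the metrics separate per criterion, rather than averaging them into one score, reflects the point above: an agent may recover one criterion well while degrading another. Generalisation across scenarios is not covered here; one option would be to repeat the calculation per scenario and examine the spread.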
Interpreting the results
In addition to interpreting the MSC, the evaluation:
- Developed interpretability tools to understand agent behaviour
- Investigated how agent performance changed between the simulation and more realistic environments
This work revealed sim-to-real gaps in how the environments were configured and showed how more varied training could improve agent robustness in future applications.
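One simple way to quantify a sim-to-real gap is to compare each criterion's score in the simulation with its score in the more realistic environment. The sketch below assumes scores on a 0 to 1 scale and uses invented criterion names and values; it illustrates the idea only and does not reproduce the project's analysis or any PrimAITE or PalisAIDE interface.

```python
def sim_to_real_gap(sim, real):
    """Per-criterion score drop from simulation to the realistic environment.

    Positive values mean the agent did worse in the realistic environment.
    Only criteria measured in both environments are compared.
    """
    return {msc: sim[msc] - real[msc] for msc in sim if msc in real}


# Hypothetical scores for one agent; values are invented for illustration.
sim_scores = {"msc_a": 0.85, "msc_b": 0.75}
real_scores = {"msc_a": 0.60, "msc_b": 0.70}

gaps = sim_to_real_gap(sim_scores, real_scores)
# The criterion with the largest drop is a natural place to start investigating.
worst = max(gaps, key=gaps.get)

if __name__ == "__main__":
    print(gaps, "largest drop:", worst)
```

A large gap on one criterion does not by itself say whether the cause is environment configuration or agent behaviour, which is why a comparison like this would sit alongside interpretability tools rather than replace them.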
Shape the future of autonomous cyber defence
Successfully demonstrating a T&E process for autonomous cyber defence agents marks a significant milestone. However, moving from demonstration to deployment requires further work. Key areas for future exploration and investment include:
- Developing acceptance frameworks to ensure autonomous agents meet stringent defence standards
- Enhancing human-agent teaming, focusing on seamless interaction between autonomous systems and human cyber operators
- Advancing tools and integration platforms to facilitate the deployment of these agents within operational military environments
This progress depends on collaboration. We invite interested parties to help shape the future of autonomous cyber defence.
Get in touch
For collaboration opportunities and further discussion, contact: ARCD-Track2@qinetiq.com