Human-centred test and evaluation of military AI
David Helmer, Michael Boardman, S. Kate Conroy, Adam J. Hepworth, Manoj Harjani

TL;DR
This paper emphasizes the importance of human-centered test, evaluation, verification, and validation (TEVV) frameworks for military AI systems, advocating for ongoing monitoring, standards development, and improved communication among stakeholders.
Contribution
The paper proposes adapting human-centred evaluation methods from human factors for deployed military AI, involving human users throughout the system lifecycle and developing standards and metrics for responsible AI deployment.
Findings
Need for human-in-the-loop TEVV throughout system lifecycle
Development of standards and metrics for human-centered AI evaluation
Enhanced communication to inform risk-based decision making
Abstract
The REAIM 2024 Blueprint for Action states that AI applications in the military domain should be ethical and human-centric and that humans must remain responsible and accountable for their use and effects. Developing rigorous test and evaluation, verification and validation (TEVV) frameworks will contribute to robust oversight mechanisms. TEVV in the development and deployment of AI systems needs to involve human users throughout the lifecycle. Traditional human-centred test and evaluation methods from human factors need to be adapted for deployed AI systems that require ongoing monitoring and evaluation. The language around AI-enabled systems should be shifted to inclusion of the human(s) as a component of the system. Standards and requirements supporting this adjusted definition are needed, as are metrics and means to evaluate them. The need for dialogue between technologists and…
Taxonomy
Topics: Human-Automation Interaction and Safety · Anomaly Detection Techniques and Applications · Healthcare Technology and Patient Monitoring
Methods: Network On Network
