Systematic Literature Review of Validation Methods for AI Systems
Lalli Myllyaho, Mikko Raatikainen, Tomi Männistö, Tommi Mikkonen, and Jukka K. Nurminen

TL;DR
This paper systematically reviews validation methods for AI systems and classifies them into a taxonomy. It highlights current practices and gaps, especially in continuous validation, with the aim of improving AI dependability in practical applications.
Contribution
The paper derives a comprehensive taxonomy of AI validation methods from 90 studies, clarifying current strategies and identifying gaps in continuous validation practices.
Findings
Validation methods include trial, simulation, model-centred validation, and expert opinion.
Methods for post-deployment validation include failure monitors and safety channels.
Few studies report on continuous validation practices.
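To make the post-deployment terms concrete, the sketch below shows one possible reading of a failure monitor paired with a safety channel: a runtime check decides whether the AI component's output can be trusted, and a simple conservative rule takes over when it cannot. This is an illustrative assumption, not the paper's definition or implementation. The names MonitoredController, toy_model, and safe_default, the input range, and the confidence threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class MonitoredController:
    """Wraps an AI component with a failure monitor and a safety channel.

    Hypothetical illustration; thresholds and interfaces are assumptions.
    """
    model: Callable[[float], Tuple[float, float]]  # returns (output, confidence)
    fallback: Callable[[float], float]             # safety channel: conservative rule
    input_range: Tuple[float, float]               # range the model was validated on
    min_confidence: float = 0.8

    def monitor_ok(self, x: float, confidence: float) -> bool:
        # Failure monitor: reject inputs outside the validated range
        # and outputs the model itself is unsure about.
        lo, hi = self.input_range
        return lo <= x <= hi and confidence >= self.min_confidence

    def act(self, x: float) -> Tuple[float, str]:
        output, confidence = self.model(x)
        if self.monitor_ok(x, confidence):
            return output, "model"
        # Monitor flagged a possible failure: hand over to the safety channel.
        return self.fallback(x), "safety_channel"


def toy_model(x: float) -> Tuple[float, float]:
    # Stand-in for a learned component: confident near 0, less so further away.
    confidence = 1.0 if abs(x) <= 1.0 else 0.5
    return 2.0 * x, confidence


def safe_default(x: float) -> float:
    # Conservative action that ignores the model entirely.
    return 0.0


controller = MonitoredController(toy_model, safe_default, input_range=(-5.0, 5.0))
```

Calling controller.act(0.5) uses the model's output, while controller.act(3.0) (low confidence) and controller.act(10.0) (outside the validated range) are both routed to the safety channel. Returning the source label alongside the value is a design choice that lets the frequency of hand-overs be logged, which is one way such a monitor could feed continuous validation.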
Abstract
Context: Artificial intelligence (AI) has made its way into everyday activities, particularly through new techniques such as machine learning (ML). These techniques are implementable with little domain knowledge. This, combined with the difficulty of testing AI systems with traditional methods, has made system trustworthiness a pressing issue. Objective: This paper studies the methods used to validate practical AI systems reported in the literature. Our goal is to classify and describe the methods that are used in realistic settings to ensure the dependability of AI systems. Method: A systematic literature review resulted in 90 papers. Systems presented in the papers were analysed based on their domain, task, complexity, and applied validation methods. Results: The validation methods were synthesized into a taxonomy consisting of trial, simulation, model-centred validation, and…
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Software Reliability and Analysis Research · Autonomous Vehicle Technology and Safety
