How to Evaluate Explainability? -- A Case for Three Criteria
Timo Speith

TL;DR
This paper advocates for three key criteria—comprehensibility, fidelity, and assessability—to evaluate the explainability of software systems, emphasizing the need for consensus and appropriate methods in this emerging field.
Contribution
It introduces three core criteria for evaluating explainability, fostering discussion and development of suitable evaluation methods in the field.
Findings
Proposes three criteria for explainability evaluation
Highlights the lack of consensus on evaluation methods
Encourages multidisciplinary discussion on evaluation standards
Abstract
The increasing complexity of software systems and the influence of software-supported decisions in our society have sparked the need for software that is safe, reliable, and fair. Explainability has been identified as a means to achieve these qualities. It is recognized as an emerging non-functional requirement (NFR) that has a significant impact on system quality. However, in order to develop explainable systems, we need to understand when a system satisfies this NFR. To this end, appropriate evaluation methods are required. However, the field is crowded with evaluation methods, and there is no consensus on which are the "right" ones. Much less, there is not even agreement on which criteria should be evaluated. In this vision paper, we will provide a multidisciplinary motivation for three such quality criteria concerning the information that systems should provide: comprehensibility,…
Taxonomy
Topics: Software Engineering Research · Software System Performance and Reliability · Software Engineering Techniques and Practices
