A Trustworthiness Score to Evaluate DNN Predictions
Abanoub Ghobrial, Darryl Hond, Hamid Asgari, Kerstin Eder

TL;DR
This paper introduces a trustworthiness score (TS) for DNN predictions, providing a transparent and effective confidence measure that improves trust assessment and suspicious frame detection in autonomous systems.
Contribution
The paper proposes a novel trustworthiness score (TS) that enhances confidence evaluation of DNN predictions by checking for specific features, improving over traditional confidence scores.
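This summary does not spell out how TS is computed, so the following is only a hypothetical illustration of what "checking for specific features" could look like for a bounding-box detector. It assumes a separate detector supplies feature boxes (for example, body parts) and treats a prediction as supported when at least one feature box lies mostly inside it. The names `overlap_fraction`, `is_supported`, and `min_overlap`, and the 0.5 default, are assumptions made for this sketch and are not taken from the paper.

```python
from typing import Iterable, Tuple

# Axis-aligned box as (x1, y1, x2, y2); an assumed representation for this sketch.
Box = Tuple[float, float, float, float]


def overlap_fraction(feature: Box, prediction: Box) -> float:
    """Return the fraction of the feature box's area that lies inside the prediction box."""
    ix1 = max(feature[0], prediction[0])
    iy1 = max(feature[1], prediction[1])
    ix2 = min(feature[2], prediction[2])
    iy2 = min(feature[3], prediction[3])
    # Clamp to zero so non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = (feature[2] - feature[0]) * (feature[3] - feature[1])
    return inter / area if area > 0 else 0.0


def is_supported(prediction: Box, features: Iterable[Box], min_overlap: float = 0.5) -> bool:
    """Hypothetical check: does any feature box overlap the prediction enough?"""
    return any(overlap_fraction(f, prediction) >= min_overlap for f in features)
```

Under these assumptions, a prediction with no supporting feature box would be a candidate for being marked suspicious, independently of the model's own confidence value.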
Findings
TS improves prediction trustworthiness assessment by ~20%.
TS enhances suspicious frame detection accuracy by ~5%.
Method demonstrated on YOLOv5 for person detection.
Abstract
Due to the black-box nature of deep neural networks (DNNs), continuously validating a DNN during operation is challenging in the absence of a human monitor. As a result, it is difficult for developers and regulators to gain confidence in deploying autonomous systems that employ DNNs. For safety during operation, it is critical to know when a DNN's predictions are trustworthy or suspicious. In the absence of a human monitor, the basic approach is to use the model's output confidence score to assess whether predictions are trustworthy or suspicious. However, the model's confidence score results from computations inside a black box; it therefore lacks transparency, which makes it challenging to automatically credit trustworthiness to predictions. We introduce the trustworthiness score (TS), a simple metric that provides a more transparent and effective way of providing confidence…
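The baseline the abstract describes, using the model's own confidence score to separate trustworthy from suspicious predictions, can be sketched as a simple threshold rule. This is a minimal illustration only: the function name `flag_by_confidence` and the 0.5 threshold are assumptions for this sketch, not values from the paper.

```python
from typing import List, Sequence


def flag_by_confidence(confidences: Sequence[float], threshold: float = 0.5) -> List[str]:
    """Label each prediction using only the model's output confidence score.

    The threshold is an assumed, tunable value; the abstract does not give one.
    """
    return ["trustworthy" if c >= threshold else "suspicious" for c in confidences]


if __name__ == "__main__":
    # Three hypothetical detections with confidence scores from a detector.
    print(flag_by_confidence([0.9, 0.3, 0.5]))
```

The rule depends entirely on a number produced inside the network, which is the lack of transparency the abstract points to.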
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Advanced Neural Network Applications
Methods: Spatio-temporal stability analysis
