I-trustworthy Models. A framework for trustworthiness evaluation of probabilistic classifiers
Ritwik Vashistha, Arya Farahi

TL;DR
This paper introduces I-trustworthy, a framework for evaluating the trustworthiness of probabilistic classifiers that links local calibration to trust. It tests local calibration with a kernel-based hypothesis test and provides theoretical guarantees.
Contribution
It formalizes the I-trustworthy framework, develops the Kernel Local Calibration Error (KLCE) test, and offers tools for diagnosing biases and assessing existing recalibration methods.
Findings
KLCE effectively tests local calibration in classifiers.
Convergence bounds are established for an unbiased estimator of KLCE.
Existing recalibration methods often fail to achieve I-trustworthiness.
Abstract
As probabilistic models continue to permeate various facets of our society and contribute to scientific advancements, it becomes necessary to go beyond traditional metrics such as predictive accuracy and error rates and assess their trustworthiness. Grounded in the competence-based theory of trust, this work formalizes the I-trustworthy framework, a novel framework for assessing the trustworthiness of probabilistic classifiers for inference tasks by linking local calibration to trustworthiness. To assess I-trustworthiness, we use the local calibration error (LCE) and develop a hypothesis-testing method. This method utilizes a kernel-based test statistic, Kernel Local Calibration Error (KLCE), to test local calibration of a probabilistic classifier. This study provides theoretical guarantees by offering convergence bounds for an unbiased estimator of KLCE. Additionally, we present a…
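The abstract does not give the definition of KLCE, so the following is only a minimal sketch of the general idea of a kernel-based local calibration test for a binary classifier, not the authors' estimator. It assumes a Gaussian kernel over the features, a generic U-statistic built from the residuals `y - p`, and a null distribution simulated by redrawing labels from the predicted probabilities. The function names, the bandwidth, and the resampling scheme are all illustrative assumptions.

```python
import numpy as np


def rbf_gram(x, bandwidth=1.0):
    """Gaussian (RBF) Gram matrix over the rows of x (kernel choice is an assumption)."""
    x = np.asarray(x, dtype=float)
    x = x.reshape(len(x), -1)  # accept 1-D feature vectors too
    sq = np.sum(x * x, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (x @ x.T), 0.0)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))


def kernel_calibration_ustat(gram, residuals):
    """U-statistic: average over i != j of r_i * r_j * k(x_i, x_j)."""
    r = np.asarray(residuals, dtype=float)
    n = len(r)
    weighted = np.outer(r, r) * gram
    np.fill_diagonal(weighted, 0.0)  # drop i == j terms so the estimate is unbiased
    return weighted.sum() / (n * (n - 1))


def local_calibration_test(x, p, y, bandwidth=1.0, n_resamples=200, seed=0):
    """Return (statistic, p-value) for H0: E[y | x] equals the predicted p.

    The null is simulated by redrawing labels y* ~ Bernoulli(p); this
    resampling scheme is an illustrative assumption, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    p = np.asarray(p, dtype=float)
    y = np.asarray(y, dtype=float)
    gram = rbf_gram(x, bandwidth)
    observed = kernel_calibration_ustat(gram, y - p)
    exceed = 0
    for _ in range(n_resamples):
        y_null = (rng.random(p.shape) < p).astype(float)
        if kernel_calibration_ustat(gram, y_null - p) >= observed:
            exceed += 1
    return observed, (1 + exceed) / (1 + n_resamples)
```

A classifier that always predicts the base rate can be calibrated on average and still fail this kind of test, because its residuals vary systematically with the features. That is the gap between global and local calibration that the framework targets.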
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning
