TRUST: Test-time Resource Utilization for Superior Trustworthiness
Haripriya Harikumar, Santu Rana

TL;DR
This paper introduces TRUST, a test-time optimization method that improves confidence estimates and out-of-distribution detection by accounting for classifier noise, enhancing trustworthiness in predictions.
Contribution
It proposes a novel test-time resource utilization technique that yields more reliable confidence scores and better OOD detection than existing methods.
Findings
Improves risk-based metrics like AUSE and AURC.
Effectively detects distribution shifts and OOD samples.
Differentiates CNN and ViT classifier behaviors across datasets.
Abstract
Standard uncertainty estimation techniques, such as dropout, often struggle to clearly distinguish reliable predictions from unreliable ones. We attribute this limitation to noisy classifier weights, which, while not impairing overall class-level predictions, render finer-level statistics less informative. To address this, we propose a novel test-time optimization method that accounts for the impact of such noise to produce more reliable confidence estimates. This score defines a monotonic subset-selection function, where population accuracy consistently increases as samples with lower scores are removed, and it demonstrates superior performance in standard risk-based metrics such as AUSE and AURC. Additionally, our method effectively identifies discrepancies between training and test distributions, reliably differentiates in-distribution from out-of-distribution samples, and elucidates…
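The abstract evaluates the score with risk-based metrics such as AURC, the area under the risk-coverage curve. As a rough illustration of that metric only (not of TRUST itself, whose optimization procedure is not described on this page), the sketch below ranks samples by a generic confidence score, keeps the top-scoring fraction at each coverage level, and averages the resulting error rates. The function names and the simple discrete averaging convention are assumptions made for this example; other AURC conventions exist.

```python
import numpy as np


def risk_coverage_curve(confidence, correct):
    """Return (coverage, risk) when samples are kept in order of decreasing confidence.

    confidence: per-sample scores, higher means more trusted (any score works here).
    correct:    per-sample booleans, True where the prediction matched the label.
    """
    confidence = np.asarray(confidence, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    # Most confident samples first; a stable sort keeps ties in input order.
    order = np.argsort(-confidence, kind="stable")
    errors = (~correct[order]).astype(float)
    kept = np.arange(1, len(errors) + 1)
    coverage = kept / len(errors)       # fraction of samples retained
    risk = np.cumsum(errors) / kept     # error rate among retained samples
    return coverage, risk


def aurc(confidence, correct):
    """Discrete AURC: mean risk over all coverage levels (lower is better)."""
    _, risk = risk_coverage_curve(confidence, correct)
    return float(risk.mean())


# Toy data (hypothetical): two correct and two wrong predictions.
correct = [True, True, False, False]
good_scores = [0.9, 0.8, 0.3, 0.1]   # wrong predictions get low confidence
bad_scores = [0.1, 0.2, 0.8, 0.9]    # wrong predictions get high confidence

print(aurc(good_scores, correct))
print(aurc(bad_scores, correct))
```

A score that ranks correct predictions above incorrect ones gives a lower AURC, which is the sense in which removing low-score samples should raise accuracy on the samples that remain.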
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Advanced Neural Network Applications
