Automated Trustworthiness Oracle Generation for Machine Learning Text Classifiers
Lam Nguyen Tung, Steven Cho, Xiaoning Du, Neelofar Neelofar, Valerio Terragni, Stefano Ruberto, Aldeida Aleti

TL;DR
This paper introduces TOKI, an automated method for generating trustworthiness oracles for text classifiers, which improves trust assessment accuracy and guides adversarial attacks more effectively than existing methods.
Contribution
TOKI is the first automated trustworthiness oracle for text classifiers, leveraging semantic relatedness of explanation words to class labels, and includes a novel adversarial attack targeting trustworthiness vulnerabilities.
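As a rough illustration of what "semantic relatedness of explanation words to class labels" could mean, the toy sketch below scores a prediction by the mean cosine similarity between its explanation words and the predicted label. This is not TOKI's implementation, which this summary does not describe. The hand-written vectors, the names `relatedness` and `looks_trustworthy`, the mean-similarity rule, and the 0.5 threshold are all assumptions made for the example.

```python
import math

# Hypothetical 3-d word vectors, hand-written for illustration only.
# A real setup would load pretrained word embeddings instead.
TOY_VECTORS = {
    "sports": (0.9, 0.1, 0.0),
    "goal": (0.8, 0.2, 0.1),
    "match": (0.7, 0.3, 0.0),
    "the": (0.0, 0.1, 0.9),
    "tuesday": (0.1, 0.0, 0.8),
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def relatedness(explanation_words, label, vectors=TOY_VECTORS):
    """Mean cosine similarity between known explanation words and the label."""
    label_vec = vectors[label]
    sims = [cosine(vectors[w], label_vec) for w in explanation_words if w in vectors]
    return sum(sims) / len(sims) if sims else 0.0


def looks_trustworthy(explanation_words, label, threshold=0.5):
    """Assumed decision rule: trust the prediction if relatedness meets the threshold."""
    return relatedness(explanation_words, label) >= threshold


# Explanation words related to the label pass; unrelated ones do not.
print(looks_trustworthy(["goal", "match"], "sports"))
print(looks_trustworthy(["the", "tuesday"], "sports"))
```

Under these toy vectors, a "sports" prediction explained by "goal" and "match" is accepted. The same prediction explained by "the" and "tuesday" is flagged, which is the kind of spurious-correlation case the abstract describes.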
Findings
TOKI achieves 142% higher accuracy than naive confidence-based methods.
The TOKI-guided attack is more effective and requires fewer perturbations than the state-of-the-art A2T attack.
Prediction uncertainty alone is insufficient for trustworthiness assessment.
Abstract
Machine learning (ML) for text classification has been widely used in various domains. These applications can significantly impact ethics, economics, and human behavior, raising serious concerns about trusting ML decisions. Studies indicate that conventional metrics are insufficient to build human trust in ML models. These models often learn spurious correlations and predict based on them. In the real world, their performance can deteriorate significantly. To avoid this, a common practice is to test whether predictions are reasonable based on valid patterns in the data. Along with this, a challenge known as the trustworthiness oracle problem has been introduced. Due to the lack of automated trustworthiness oracles, the assessment requires manual validation of the decision process disclosed by explanation methods. However, this is time-consuming, error-prone, and unscalable. We propose…
Taxonomy
Topics: Cloud Data Security Solutions · Blockchain Technology Applications and Security
