Improving the trustworthiness of image classification models by utilizing bounding-box annotations
Dharma KC, Chicheng Zhang

TL;DR
This paper introduces a method that uses bounding-box annotations during training to enhance the accuracy, robustness, and interpretability of image classification models.
Contribution
It proposes a novel training objective that leverages bounding-box annotations to improve model trustworthiness in image classification tasks.
Findings
Preliminary experiments show improved accuracy over baseline models
Enhanced robustness to adversarial examples
Greater interpretability of model decisions
Abstract
We study utilizing auxiliary information in training data to improve the trustworthiness of machine learning models. Specifically, in the context of image classification, we propose to optimize a training objective that incorporates bounding box information, which is available in many image classification datasets. Preliminary experimental results show that the proposed algorithm achieves better performance in accuracy, robustness, and interpretability compared with baselines.
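The abstract does not spell out the training objective, so the following is only an illustrative sketch of one way bounding-box information could enter a loss, not the authors' formulation. It assumes a per-pixel attribution (saliency) map is available for each image and penalizes the share of attribution that falls outside the annotated box. The names `outside_box_fraction`, `box_regularized_loss`, and the weight `lam` are hypothetical.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for a single example (numerically stable)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def outside_box_fraction(saliency, box):
    """Fraction of absolute saliency mass that lies outside the bounding box.

    saliency: 2D list (rows x cols) of attribution scores.
    box: (top, left, bottom, right) in pixel indices, bottom/right exclusive.
    """
    top, left, bottom, right = box
    total = 0.0
    outside = 0.0
    for r, row in enumerate(saliency):
        for c, value in enumerate(row):
            mass = abs(value)
            total += mass
            if not (top <= r < bottom and left <= c < right):
                outside += mass
    # An all-zero map carries no attribution, so there is nothing to penalize.
    return outside / total if total > 0 else 0.0


def box_regularized_loss(logits, label, saliency, box, lam=1.0):
    """Hypothetical objective: classification loss plus a weighted
    penalty on attribution outside the annotated bounding box."""
    return cross_entropy(logits, label) + lam * outside_box_fraction(saliency, box)
```

In a real training loop the saliency map would have to be computed differentiably (for example from input gradients) inside an autodiff framework so that the penalty can shape the model's parameters. This sketch only evaluates the loss value for given inputs.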
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Advanced Neural Network Applications
