Machine Learning from Explanations

Jiashu Tao; Reza Shokri

arXiv:2507.04788·cs.LG·July 8, 2025

TL;DR

This paper proposes a training approach that uses explanation signals, such as which input features are important, to improve model reliability and accuracy, especially on small or imbalanced datasets, by aligning the model's attention with those features.

Contribution

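The paper introduces a two-stage training cycle that alternates between improving prediction accuracy and refining the model's attention to match the explanation signals, so that the model learns the rationale behind a label assignment rather than an arbitrary rule that merely fits the labels.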
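
The alternating cycle can be illustrated with a small sketch. This is not the authors' implementation: it assumes a plain logistic regression, treats the squared weight on each feature as a stand-in for the model's attention, and assumes the explanation signal arrives as a binary mask of important features. The names `train_two_stage`, `important_mask`, and `lam`, and the synthetic data, are all hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_two_stage(X, y, important_mask, cycles=50, steps=20, lr=0.1, lam=1.0):
    """Alternate between fitting labels and aligning with an explanation mask.

    Assumption: important_mask[j] == 1 marks feature j as important.
    Assumption: "attention" is approximated by squared weight magnitude.
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    unimportant = 1.0 - important_mask
    for _ in range(cycles):
        # Stage 1: gradient steps on the cross-entropy loss (prediction accuracy).
        for _ in range(steps):
            p = sigmoid(X @ w + b)
            w -= lr * (X.T @ (p - y)) / n
            b -= lr * np.mean(p - y)
        # Stage 2: gradient steps on lam * sum(unimportant * w**2), which
        # shrinks the weight placed on features the explanation marks unimportant.
        for _ in range(steps):
            w -= lr * lam * 2.0 * unimportant * w
    return w, b


# Synthetic data (hypothetical): feature 0 is the true signal; feature 1 also
# tracks the label in this small sample but is marked unimportant.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=40).astype(float)
signal = 2.0 * y - 1.0
X = np.column_stack([
    signal + 0.5 * rng.normal(size=40),
    signal + 0.1 * rng.normal(size=40),
])
important_mask = np.array([1.0, 0.0])

w, b = train_two_stage(X, y, important_mask)
print("weights:", w, "bias:", b)
```

In this toy setup the second stage repeatedly pulls the weight on the unimportant feature back toward zero, so the classifier ends up relying on the feature the explanation marks as important. A real model would need a differentiable attention or attribution measure in place of the squared-weight proxy.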

Findings

01 Accelerates convergence to accurate models

02 Enhances model reliability on small datasets

03 Reduces learning of spurious correlations

Abstract

Acquiring and training on large-scale labeled data can be impractical due to cost constraints. Additionally, the use of small training datasets can result in considerable variability in model outcomes, overfitting, and learning of spurious correlations. A crucial shortcoming of data labels is their lack of any reasoning behind a specific label assignment, causing models to learn any arbitrary classification rule as long as it aligns data with labels. To overcome these issues, we introduce an innovative approach for training reliable classification models on smaller datasets, by using simple explanation signals such as important input features from labeled data. Our method centers around a two-stage training cycle that alternates between enhancing model prediction accuracy and refining its attention to match the explanations. This instructs models to grasp the rationale behind label…
