Stochastic Amortization: A Unified Approach to Accelerate Feature and Data Attribution

Ian Covert; Chanwoo Kim; Su-In Lee; James Zou; Tatsunori Hashimoto

arXiv:2401.15866 · cs.LG · October 31, 2024 · 1 citation

TL;DR

This paper introduces stochastic amortization, which trains amortized models on noisy labels to accelerate feature attribution and data valuation in explainable machine learning, often yielding an order-of-magnitude speedup over existing approaches.

Contribution

It proposes a novel approach to train amortized models with noisy labels, enabling faster approximations in large-scale explainability tasks.

Findings

01 Training tolerates high levels of label noise

02 Often achieves an order-of-magnitude speedup over existing approaches

03 Effective across various models and datasets

Abstract

Many tasks in explainable machine learning, such as data valuation and feature attribution, perform expensive computation for each data point and are intractable for large datasets. These methods require efficient approximations, and although amortizing the process by learning a network to directly predict the desired output is a promising solution, training such models with exact labels is often infeasible. We therefore explore training amortized models with noisy labels, and we find that this is inexpensive and surprisingly effective. Through theoretical analysis of the label noise and experiments with various models and datasets, we show that this approach tolerates high noise levels and significantly accelerates several feature attribution and data valuation methods, often yielding an order of magnitude speedup over existing approaches.
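
The core idea in the abstract, that an amortized model can be trained on cheap noisy labels instead of expensive exact ones, can be illustrated with a toy regression. The sketch below is not the paper's implementation: it assumes a linear model as a stand-in for the amortized network, synthetic "exact" attributions, and zero-mean Gaussian label noise. All names (`fit_amortized_linear`, `predict`, `true_map`) are hypothetical. Under these assumptions, a squared-error fit averages out the noise across data points, so the model's predictions end up much closer to the exact targets than the individual noisy labels are.

```python
import numpy as np


def fit_amortized_linear(x, labels):
    """Least-squares fit of a linear map x -> labels (stand-in for a network)."""
    x_aug = np.hstack([x, np.ones((x.shape[0], 1))])  # append a bias column
    weights, *_ = np.linalg.lstsq(x_aug, labels, rcond=None)
    return weights


def predict(weights, x):
    """Apply the fitted linear map to new inputs."""
    x_aug = np.hstack([x, np.ones((x.shape[0], 1))])
    return x_aug @ weights


rng = np.random.default_rng(0)
n, d, k = 5000, 8, 3  # data points, input dimension, outputs per point

# Assumption: the exact per-point targets are a fixed function of the input.
x = rng.normal(size=(n, d))
true_map = rng.normal(size=(d, k))
exact = x @ true_map

# Assumption: noisy labels are unbiased (zero-mean noise) with high variance.
noisy = exact + rng.normal(scale=5.0, size=exact.shape)

# Train only on the noisy labels, then evaluate against exact targets.
weights = fit_amortized_linear(x, noisy)
x_test = rng.normal(size=(1000, d))
exact_test = x_test @ true_map
model_mse = np.mean((predict(weights, x_test) - exact_test) ** 2)
label_mse = np.mean((noisy - exact) ** 2)

print(f"error of individual noisy labels: {label_mse:.3f}")
print(f"error of model trained on them:   {model_mse:.3f}")
```

The toy setup is deliberately favorable (the linear model can represent the target exactly), so it shows only the noise-averaging effect, not how well a real amortized network would generalize.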

Taxonomy

Topics: Machine Learning and Data Classification