Quanda: An Interpretability Toolkit for Training Data Attribution Evaluation and Beyond

Dilyara Bareeva; Galip Ümit Yolcu; Anna Hedström; Niklas Schmolenski; Thomas Wiegand; Wojciech Samek; Sebastian Lapuschkin

arXiv:2410.07158 · cs.LG · October 11, 2024

TL;DR

Quanda is an open-source Python toolkit that standardizes the evaluation and benchmarking of training data attribution (TDA) methods, with the aim of making them more trustworthy as interpretability tools for neural networks.

Contribution

The paper introduces a unified framework and a comprehensive set of metrics for evaluating TDA methods, enabling systematic comparison and supporting broader adoption.

Findings

01 Provides a comprehensive set of evaluation metrics

02 Enables seamless integration with existing TDA tools

03 Supports systematic benchmarking of TDA methods
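
To make the idea of benchmarking several TDA methods behind one interface concrete, here is a minimal sketch in plain Python. It does not use Quanda's API: the `Attributor` signature, the two toy attributors, the `same_class_rate` metric, and the data are all hypothetical and chosen only for illustration. The metric is a simple top-1 check of whether the most strongly attributed training sample shares the test sample's label.

```python
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]
# Hypothetical uniform interface (not Quanda's): given the training inputs
# and one test input, return one attribution score per training sample.
Attributor = Callable[[List[Vector], Vector], List[float]]


def dot_attributor(train_x: List[Vector], test_x: Vector) -> List[float]:
    """Toy attributor: dot-product similarity to each training sample."""
    return [sum(a * b for a, b in zip(x, test_x)) for x in train_x]


def neg_distance_attributor(train_x: List[Vector], test_x: Vector) -> List[float]:
    """Toy attributor: negative squared Euclidean distance."""
    return [-sum((a - b) ** 2 for a, b in zip(x, test_x)) for x in train_x]


def same_class_rate(attributor: Attributor, train_x: List[Vector],
                    train_y: List[int], test_x: List[Vector],
                    test_y: List[int]) -> float:
    """Fraction of test samples whose top-attributed training sample
    carries the same label as the test sample."""
    hits = 0
    for x, y in zip(test_x, test_y):
        scores = attributor(train_x, x)
        top = max(range(len(scores)), key=scores.__getitem__)
        hits += int(train_y[top] == y)
    return hits / len(test_x)


def benchmark(attributors: Dict[str, Attributor], train_x: List[Vector],
              train_y: List[int], test_x: List[Vector],
              test_y: List[int]) -> Dict[str, float]:
    """Score every attributor with the same metric on the same data."""
    return {name: same_class_rate(fn, train_x, train_y, test_x, test_y)
            for name, fn in attributors.items()}


# Made-up data: two clusters plus one large-norm class-1 sample that
# dominates dot-product scores regardless of the test point.
TRAIN_X = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9), (3.0, 3.0)]
TRAIN_Y = [0, 0, 1, 1, 1]
TEST_X = [(0.8, 0.2), (0.2, 0.8)]
TEST_Y = [0, 1]

results = benchmark(
    {"dot_product": dot_attributor, "neg_distance": neg_distance_attributor},
    TRAIN_X, TRAIN_Y, TEST_X, TEST_Y,
)
print(results)
```

Because both attributors share one call signature, adding a third method or a second metric only means adding one more function. That is the kind of systematic comparison a unified evaluation framework is meant to enable.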

Abstract

In recent years, training data attribution (TDA) methods have emerged as a promising direction for the interpretability of neural networks. While research around TDA is thriving, limited effort has been dedicated to the evaluation of attributions. Similar to the development of evaluation metrics for traditional feature attribution approaches, several standalone metrics have been proposed to evaluate the quality of TDA methods across various contexts. However, the lack of a unified framework that allows for systematic comparison limits trust in TDA methods and stunts their widespread adoption. To address this research gap, we introduce Quanda, a Python toolkit designed to facilitate the evaluation of TDA methods. Beyond offering a comprehensive set of evaluation metrics, Quanda provides a uniform interface for seamless integration with existing TDA implementations across different…

Code & Models

Repositories

dilyabareeva/quanda (PyTorch, official)

Taxonomy

Topics: Data Quality and Management · Natural Language Processing Techniques

Methods: Sparse Evolutionary Training · Lib