Transparent Trade-offs between Properties of Explanations

Hiwot Belay Tadesse; Alihan Hüyük; Yaniv Yacoby; Weiwei Pan; Finale Doshi-Velez

arXiv:2410.23880 · cs.LG · July 22, 2025

TL;DR

This paper proposes directly optimizing explanations of black-box models for desired properties. Unlike earlier methods that merely encourage those properties, direct optimization produces them more consistently and lets users control the trade-offs between them.

Contribution

The paper proposes a direct optimization approach to explanations that gives users explicit control over trade-offs between properties and yields the desired properties more consistently.

Findings

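01. Direct optimization produces explanations that achieve the targeted properties more consistently.

02. Users can control the trade-offs between properties to tailor explanations to a specific task.

03. The method outperforms encouragement-based approaches in how consistently explanations exhibit the targeted properties.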

Abstract

When explaining black-box machine learning models, it's often important for explanations to have certain desirable properties. Most existing methods 'encourage' desirable properties in their construction of explanations. In this work, we demonstrate that these forms of encouragement do not consistently create explanations with the properties that are supposedly being targeted. Moreover, they do not allow for any control over which properties are prioritized when different properties are at odds with each other. We propose to directly optimize explanations for desired properties. Our direct approach not only produces explanations with optimal properties more consistently but also empowers users to control trade-offs between different properties, allowing them to create explanations with exactly what is needed for a particular task.
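
The idea of optimizing explanations directly, with an explicit trade-off between properties, can be illustrated with a small sketch. This is not the paper's formulation: the abstract does not specify the properties or the objective, so everything below is an assumption made for illustration. The sketch takes two stand-in properties, faithfulness (each explanation matches the local gradient of the black-box function) and robustness (nearby inputs receive similar explanations), and minimizes a weighted sum of their losses. The weight `lam`, the Gaussian similarity graph, and all function names are hypothetical.

```python
import numpy as np

def finite_difference_gradients(f, X, eps=1e-4):
    """Central-difference gradient of f at every row of X (assumed faithfulness target)."""
    n, d = X.shape
    G = np.zeros((n, d))
    for j in range(d):
        step = np.zeros(d)
        step[j] = eps
        G[:, j] = (f(X + step) - f(X - step)) / (2 * eps)
    return G

def graph_laplacian(X, bandwidth=1.0):
    """Laplacian of a Gaussian similarity graph over the inputs (an assumed choice)."""
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq_dists / (2 * bandwidth ** 2))
    np.fill_diagonal(W, 0.0)
    return np.diag(W.sum(axis=1)) - W

def faithfulness_loss(E, G):
    """Squared distance between explanations E and local gradients G."""
    return ((E - G) ** 2).sum()

def robustness_loss(E, L):
    """Equals 0.5 * sum_ij W_ij * ||E_i - E_j||^2 for the graph behind L."""
    return np.trace(E.T @ L @ E)

def optimize_explanations(G, L, lam):
    """Minimize (1 - lam) * faithfulness + lam * robustness over E, for 0 <= lam < 1.

    The objective is quadratic, so setting its gradient to zero gives the
    linear system ((1 - lam) * I + lam * L) E = (1 - lam) * G.
    """
    n = G.shape[0]
    A = (1 - lam) * np.eye(n) + lam * L
    return np.linalg.solve(A, (1 - lam) * G)

# Toy black-box function and inputs, chosen only for the demonstration.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(30, 2))
f = lambda X: np.sin(3 * X[:, 0]) + X[:, 1] ** 2

G = finite_difference_gradients(f, X)
L = graph_laplacian(X, bandwidth=0.5)

for lam in (0.0, 0.5, 0.9):
    E = optimize_explanations(G, L, lam)
    print(lam, faithfulness_loss(E, G), robustness_loss(E, L))
```

In this toy setting, `lam = 0` recovers the gradients exactly, and raising `lam` trades faithfulness for robustness. The user picks the point on that trade-off curve rather than accepting whatever balance a fixed construction happens to produce.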

Taxonomy

Topics: Semantic Web and Ontologies · Machine Learning in Materials Science