TL;DR
This paper argues that post-hoc explanation algorithms are unsuitable in adversarial contexts, particularly where explanations are legally required: the explanation provider and receiver have conflicting interests, and the explanations are ambiguous enough to be manipulated. Alternative transparency mechanisms are therefore needed.
Contribution
The paper combines legal, philosophical, and technical analysis to demonstrate the limitations of post-hoc explanations in adversarial settings and highlights the need for alternative approaches.
Findings
Post-hoc explanations are manipulable in adversarial contexts.
The law's transparency objectives cannot be met with post-hoc explanation algorithms.
There is a need for alternative mechanisms to achieve explainability.
Abstract
Existing and planned legislation stipulates various obligations to provide information about machine learning algorithms and their functioning, often interpreted as obligations to "explain". Many researchers suggest using post-hoc explanation algorithms for this purpose. In this paper, we combine legal, philosophical and technical arguments to show that post-hoc explanation algorithms are unsuitable to achieve the law's objectives. Indeed, most situations where explanations are requested are adversarial, meaning that the explanation provider and receiver have opposing interests and incentives, so that the provider might manipulate the explanation for her own ends. We show that this fundamental conflict cannot be resolved because of the high degree of ambiguity of post-hoc explanations in realistic application scenarios. As a consequence, post-hoc explanation algorithms are unsuitable to…
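The ambiguity the abstract refers to can be illustrated with a small sketch. This is not the paper's own construction. It assumes a hypothetical linear scoring model and a simple baseline-relative attribution rule, with all feature names, weights, and values invented for illustration. Two explanations of the same prediction are both internally consistent, since their attributions sum to the score difference from the chosen baseline. They still disagree on whether a sensitive feature mattered, and the provider decides which baseline to report.

```python
from typing import Dict

# Hypothetical linear scoring model; weights and bias are invented for illustration.
WEIGHTS: Dict[str, float] = {"income": 0.5, "debt": -0.8, "age": -0.6}
BIAS = 0.1


def score(x: Dict[str, float]) -> float:
    """Model output for one input."""
    return BIAS + sum(WEIGHTS[k] * x[k] for k in WEIGHTS)


def attribute(x: Dict[str, float], baseline: Dict[str, float]) -> Dict[str, float]:
    """Baseline-relative attribution: w_i * (x_i - baseline_i) per feature.

    For a linear model these attributions sum exactly to
    score(x) - score(baseline).
    """
    return {k: WEIGHTS[k] * (x[k] - baseline[k]) for k in WEIGHTS}


# One applicant, one model, one prediction.
applicant = {"income": 1.0, "debt": 0.5, "age": 2.0}

# Two baselines the provider could defend as "reasonable" reference points.
baseline_a = {"income": 0.0, "debt": 0.0, "age": 0.0}
baseline_b = {"income": 0.0, "debt": 0.0, "age": 2.0}  # shares the applicant's age

expl_a = attribute(applicant, baseline_a)  # age receives a negative attribution
expl_b = attribute(applicant, baseline_b)  # age receives zero attribution

for name, expl in (("baseline A", expl_a), ("baseline B", expl_b)):
    print(name, expl)
```

Nothing in the output tells the recipient which baseline is the "right" one. In an adversarial setting, this gives the provider room to select the explanation that best serves her interests.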
