TL;DR
This paper shows that SHAP explanations for feature importance in tabular data are highly sensitive to how data is represented, which can be exploited to hide biases or discrimination.
Contribution
It reveals that common data engineering choices can manipulate SHAP explanations, highlighting a new vulnerability in local feature-based explanations.
Findings
Feature importance can be manipulated by data representation choices.
Explanations are sensitive to simple data transformations, such as representing age with a histogram.
Adversaries can exploit this sensitivity to obscure biases.
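The findings above can be illustrated with a minimal sketch. It is not the paper's experiment and does not use the SHAP library: Shapley values are computed exactly by brute force for a hypothetical two-feature model, and the model, the background data, and the bucket edges are all assumptions made up for illustration. The instance is explained twice, once with raw age as the interpretable feature and once with a coarse age bucket.

```python
from itertools import combinations
from math import factorial

def model(age, group):
    # Hypothetical scoring rule: group membership only matters for age >= 50.
    return 0.01 * age + (0.3 if group == 1 and age >= 50 else 0.0)

def shapley(value, n):
    """Exact Shapley values for a set function `value` over n features."""
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for subset in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

def make_value(x, background, bucket=None):
    """Value function v(S): expected model output with features in S 'known'.

    With bucket=None, knowing age means fixing it to x's exact age.
    With bucket=(lo, hi), knowing age only means knowing it lies in [lo, hi),
    so age is averaged over the background ages that fall in that bucket.
    """
    x_age, x_group = x
    if bucket is None:
        present_ages = [x_age]
    else:
        lo, hi = bucket
        present_ages = [a for a, _ in background if lo <= a < hi]

    def value(S):
        total, count = 0.0, 0
        for b_age, b_group in background:
            ages = present_ages if 0 in S else [b_age]
            group = x_group if 1 in S else b_group
            for a in ages:
                total += model(a, group)
                count += 1
        return total / count

    return value

# Hypothetical background sample of (age, group) rows and instance to explain.
background = [(20, 0), (30, 1), (40, 0), (50, 1), (60, 0), (70, 1)]
x = (60, 1)

raw = shapley(make_value(x, background), 2)                        # [age, group]
bucketed = shapley(make_value(x, background, bucket=(35, 200)), 2)  # [age bucket, group]
```

In this toy setup the attribution to `group` falls from about 0.10 under the raw representation to about 0.081 under the coarse bucket, and the attribution to age falls from 0.25 to about 0.144, although the model itself is unchanged. Only the choice of interpretable representation differs between the two runs.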
Abstract
Local feature-based explanations are a key component of the XAI toolkit. These explanations compute feature importance values relative to an "interpretable" feature representation. In tabular data, feature values themselves are often considered interpretable. This paper examines the impact of data engineering choices on local feature-based explanations. We demonstrate that simple, common data engineering techniques, such as representing age with a histogram or encoding race in a specific way, can manipulate feature importance as determined by popular methods like SHAP. Notably, the sensitivity of explanations to feature representation can be exploited by adversaries to obscure issues like discrimination. While the intuition behind these results is straightforward, their systematic exploration has been lacking. Previous work has focused on adversarial attacks on feature-based…
Taxonomy
Methods: Shapley Additive Explanations
