Feature Inference Attack on Shapley Values
Xinjian Luo, Yangfan Jiang, Xiaokui Xiao

TL;DR
This paper shows that Shapley value-based interpretability methods used by major MLaaS platforms leak private information: feature inference attacks can reconstruct private model inputs from their Shapley value explanations, which underscores the need for privacy-preserving techniques.
Contribution
It is the first study to investigate privacy risks of Shapley values, proposing two attack models and demonstrating their effectiveness on leading cloud platforms.
Findings
Shapley value explanations can be exploited to infer private inputs.
Attack models successfully reconstruct most private features.
Leading MLaaS platforms are vulnerable to feature inference attacks.
Abstract
As a solution concept in cooperative game theory, the Shapley value is highly recognized in model interpretability studies and widely adopted by the leading Machine Learning as a Service (MLaaS) providers, such as Google, Microsoft, and IBM. However, although Shapley value-based model interpretability methods have been thoroughly studied, few researchers have considered the privacy risks incurred by Shapley values, even though interpretability and privacy are two foundations of machine learning (ML) models. In this paper, we investigate the privacy risks of Shapley value-based model interpretability methods using feature inference attacks: reconstructing the private model inputs based on their Shapley value explanations. Specifically, we present two adversaries. The first adversary can reconstruct the private inputs by training an attack model based on an auxiliary dataset and black-box access to…
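To make the idea of a feature inference attack concrete, here is a minimal toy sketch of the first adversary's setting: an auxiliary dataset plus black-box access to an explanation service. It is not the paper's attack model. The linear target model, the zero baseline, the `explain` function, and the per-feature least-squares attack are all assumptions chosen for illustration, and they make the leak unusually easy to see because each Shapley value of a linear model is an affine function of one input feature.

```python
import itertools
import math
import random


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x; absent features take baseline values."""
    d = len(x)
    phi = [0.0] * d
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for k in range(d):
            # Weight of a coalition of size k that does not contain feature i.
            weight = math.factorial(k) * math.factorial(d - k - 1) / math.factorial(d)
            for S in itertools.combinations(others, k):
                with_i = [x[j] if (j in S or j == i) else baseline[j] for j in range(d)]
                without_i = [x[j] if j in S else baseline[j] for j in range(d)]
                phi[i] += weight * (f(with_i) - f(without_i))
    return phi


def fit_line(us, vs):
    """Least-squares fit of vs as a * us + c; returns (a, c)."""
    n = len(us)
    mean_u, mean_v = sum(us) / n, sum(vs) / n
    var_u = sum((u - mean_u) ** 2 for u in us)
    a = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs)) / var_u
    return a, mean_v - a * mean_u


# Hypothetical target model and baseline (assumptions, not from the paper).
WEIGHTS = [2.0, -1.5, 0.5]
BASELINE = [0.0, 0.0, 0.0]


def model(x):
    return sum(w * xi for w, xi in zip(WEIGHTS, x)) + 0.3


def explain(x):
    """Stand-in for the black-box explanation service the adversary queries."""
    return shapley_values(model, x, BASELINE)


# Adversary: query the service on auxiliary inputs, then learn a map from
# each Shapley value back to the corresponding feature.
rng = random.Random(0)
aux_inputs = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(50)]
aux_explanations = [explain(x) for x in aux_inputs]
attack = [
    fit_line([p[i] for p in aux_explanations], [x[i] for x in aux_inputs])
    for i in range(3)
]


def reconstruct(phi):
    """Estimate the private input from its observed Shapley values."""
    return [a * p + c for (a, c), p in zip(attack, phi)]


private_x = [0.4, -0.7, 0.9]
observed_phi = explain(private_x)      # all the adversary sees for the victim
recovered_x = reconstruct(observed_phi)
```

In this toy setting `recovered_x` matches `private_x` up to floating-point error. Against nonlinear models the map from explanations to inputs is no longer affine, so a more expressive attack model would be needed; the sketch only illustrates why explanations can carry enough information to reconstruct inputs.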
