Linear Model Extraction via Factual and Counterfactual Queries
Daan Otto, Jannis Kurtz, Dick den Hertog, Ilker Birbil

TL;DR
This paper investigates how different types of queries, especially counterfactuals, can be used to extract the parameters of linear models, and shows how the choice of distance measure and the robustness requirement affect query efficiency and model security.
Contribution
It introduces mathematical formulations for classification regions based on various queries and derives bounds on the number of queries needed for model extraction under different distance measures.
Findings
A single counterfactual query suffices when the distance measure is differentiable
Query complexity grows linearly with data dimension for polyhedral distances
Robust counterfactuals require twice as many queries as standard counterfactuals
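The first finding can be illustrated for the Euclidean distance, which is one differentiable case. The sketch below is not the paper's algorithm: it assumes a hypothetical oracle `l2_counterfactual` that returns the exact closest point on the decision boundary, and it assumes the queried point does not lie on the boundary itself. Under those assumptions the displacement from the query to its counterfactual is parallel to the weight vector, so one query fixes the model up to positive scaling. All function names are illustrative.

```python
import math

def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def predict(w, b, x):
    """Factual query: label of x under the linear classifier sign(w.x + b)."""
    return 1 if dot(w, x) + b >= 0 else -1

def l2_counterfactual(w, b, x):
    """Hypothetical oracle: Euclidean projection of x onto the boundary w.x + b = 0."""
    score = dot(w, x) + b
    norm_sq = dot(w, w)
    return [xi - score / norm_sq * wi for wi, xi in zip(w, x)]

def extract_from_single_query(x, label, x_cf):
    """Recover (w, b) up to positive scaling from one factual/counterfactual pair."""
    # x_cf - x is orthogonal to the boundary and points toward the opposite class.
    diff = [c - a for a, c in zip(x, x_cf)]
    length = math.sqrt(dot(diff, diff))  # zero only if x is already on the boundary
    # Orient the unit normal so that it points into the positive class.
    w_hat = [-label * d / length for d in diff]
    # The counterfactual lies on the boundary, so w_hat.x_cf + b_hat = 0.
    b_hat = -dot(w_hat, x_cf)
    return w_hat, b_hat

# Secret model, known only to the oracle in this toy setup.
w_true, b_true = [3.0, -4.0], 2.0
x = [1.0, 1.0]
label = predict(w_true, b_true, x)
x_cf = l2_counterfactual(w_true, b_true, x)
w_hat, b_hat = extract_from_single_query(x, label, x_cf)
```

Here `w_hat` and `b_hat` equal `w_true` and `b_true` divided by the Euclidean norm of `w_true`, which defines the same classifier.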
Abstract
In model extraction attacks, the goal is to reveal the parameters of a black-box machine learning model by querying the model for a selected set of data points. Due to an increasing demand for explanations, this may involve counterfactual queries besides the typically considered factual queries. In this work, we consider linear models and three types of queries: factual, counterfactual, and robust counterfactual. First, for an arbitrary set of queries, we derive novel mathematical formulations for the classification regions for which the decision of the unknown model is known, without recovering any of the model parameters. Second, we derive bounds on the number of queries needed to extract the model's parameters for (robust) counterfactual queries under arbitrary norm-based distances. We show that the full model can be recovered using just a single counterfactual query when…
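To see why the choice of norm matters for query counts, consider the L1 distance as one example of a non-differentiable, polyhedral distance. The sketch below is an illustration only, not the paper's construction or its bound. It assumes a hypothetical oracle `l1_counterfactual` that returns a closest boundary point in L1 distance. Such a point can always be reached by changing a single coordinate, the one with the largest absolute weight, so the displacement says nothing about the remaining weights.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def l1_counterfactual(w, b, x):
    """Hypothetical oracle: a closest point to x in L1 distance on w.x + b = 0."""
    score = dot(w, x) + b
    # The cheapest L1 move changes only the coordinate with the largest |w_i|.
    i = max(range(len(w)), key=lambda k: abs(w[k]))
    x_cf = list(x)
    x_cf[i] = x[i] - score / w[i]
    return x_cf

def changed_coordinates(x, x_cf, tol=1e-12):
    """Indices at which the counterfactual differs from the query point."""
    return [k for k, (a, c) in enumerate(zip(x, x_cf)) if abs(a - c) > tol]

# Toy model, chosen for illustration.
w_true, b_true = [3.0, -4.0, 1.0], 2.0
x = [1.0, 1.0, 1.0]
x_cf = l1_counterfactual(w_true, b_true, x)
changed = changed_coordinates(x, x_cf)
```

In this toy case only one coordinate of the query changes, so a single response cannot pin down all three weights. This is consistent with the need for more queries as the dimension grows.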
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Privacy-Preserving Technologies in Data
