Model Inversion Robustness: Can Transfer Learning Help?
Sy-Tuyen Ho, Koh Jun Hao, Keshigeyan Chandrasegaran, Ngoc-Bao Nguyen, Ngai-Man Cheung

TL;DR
This paper introduces a simple transfer learning-based defense method called TL-DMI that enhances model inversion robustness by limiting sensitive information encoding, achieving state-of-the-art privacy protection without degrading model utility.
Contribution
The paper proposes a novel transfer learning approach to defend against model inversion attacks, reducing sensitive information leakage while maintaining model performance.
Findings
TL-DMI significantly improves MI robustness in experiments.
The method is simple to implement and does not compromise model utility.
Fisher Information analysis supports the effectiveness of TL-DMI.
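For context, a common way to score how much a layer's parameters matter for a task is the trace of the empirical Fisher Information restricted to that layer. The expression below is the textbook definition, not necessarily the exact formulation used in the paper; the symbols $\theta_l$ (parameters of layer $l$), $p_\theta$ (model likelihood), and $N$ (number of samples) are notation assumed here.

```latex
% Empirical Fisher Information score for layer l (standard definition;
% the paper may use a different loss or normalization).
\[
  F(\theta_l) \;=\; \frac{1}{N} \sum_{i=1}^{N}
  \left\lVert \nabla_{\theta_l} \log p_\theta(y_i \mid x_i) \right\rVert_2^2
\]
```

Under this reading, a layer with a large score is one whose parameters strongly influence the task in question, which is the kind of layer-importance comparison the reviews attribute to the paper.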
Abstract
Model Inversion (MI) attacks aim to reconstruct private training data by abusing access to machine learning models. Contemporary MI attacks have achieved impressive attack performance, posing serious threats to privacy. Meanwhile, all existing MI defense methods rely on regularization that is in direct conflict with the training objective, resulting in noticeable degradation in model utility. In this work, we take a different perspective, and propose a novel and simple Transfer Learning-based Defense against Model Inversion (TL-DMI) to render MI-robust models. Particularly, by leveraging TL, we limit the number of layers encoding sensitive information from private training dataset, thereby degrading the performance of MI attack. We conduct an analysis using Fisher Information to justify our method. Our defense is remarkably simple to implement. Without bells and whistles, we show in…
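The layer-limiting idea described in the abstract can be pictured with a toy sketch: parameters obtained from public pretraining stay frozen, and only the remaining parameters are updated on private data. This is an illustration under stated assumptions, not the authors' implementation; the two-weight model, the `fine_tune` helper, and all data values are hypothetical.

```python
# Toy illustration only: a two-layer scalar "network" y = w2 * relu(w1 * x).
# w1 stands in for early layers pretrained on public data (kept frozen);
# w2 stands in for later layers fine-tuned on private data.
# All names and values here are hypothetical.

def relu(z):
    return z if z > 0.0 else 0.0


def fine_tune(params, data, trainable, lr=0.05, epochs=200):
    """Gradient descent on squared error, updating only names in `trainable`."""
    params = dict(params)  # work on a copy so the pretrained weights survive
    for _ in range(epochs):
        for x, y in data:
            pre = params["w1"] * x
            h = relu(pre)
            err = params["w2"] * h - y
            # Gradients of (w2 * relu(w1 * x) - y)^2 for both weights.
            grads = {
                "w2": 2.0 * err * h,
                "w1": 2.0 * err * params["w2"] * (x if pre > 0.0 else 0.0),
            }
            # Frozen parameters are simply skipped in the update step.
            for name in trainable:
                params[name] -= lr * grads[name]
    return params


pretrained = {"w1": 1.0, "w2": 0.5}        # hypothetical public-data weights
private_data = [(1.0, 2.0), (2.0, 4.0)]    # hypothetical private samples (y = 2x)

# Transfer-learning style: only w2 ever sees private data.
defended = fine_tune(pretrained, private_data, trainable={"w2"})

# Baseline for contrast: a single epoch with every weight trainable.
baseline = fine_tune(pretrained, private_data, trainable={"w1", "w2"}, epochs=1)

print("frozen w1 unchanged:", defended["w1"] == pretrained["w1"])
print("baseline w1 changed:", baseline["w1"] != pretrained["w1"])
```

In the defended run the frozen weight never receives a private-data update, so fewer parameters can encode private information; whether that translates into MI robustness on real models is what the paper's experiments are meant to establish.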
Peer Reviews
Decision: ICLR 2024 Conference Withdrawn Submission
- The paper is well-structured and easy to follow
- The topic of the paper is an interesting and important area of research
- Even though the proposed method is merely an application of an existing technique, using transfer learning for defending against model inversion attacks seems to be novel
- the different number of parameters are not visible in Fig. 1-IV
- Table 3: does only show the proposed defense against no defense and the proposed approach is not compared to existing defense methods
- Tables in the appendix are referenced in Section 4.5. This is not a good practice to reference tables in the appendix as if they were in the main part of the paper
- from the paper, the whole attack scenario is not quite clear. It is not specified how many classes are attacked and how these clas
- Interesting Defense Approach: The paper introduces a new defense strategy using transfer learning.
- Fisher Information Analysis: The paper conducts a unique analysis of layer importance using Fisher Information, providing valuable insights.
- Practicality: The proposed defense is straightforward to implement and can be applied to a wide range of model architectures.
- Limited Attack Coverage: The defense's focus on attacks leveraging GANs may not address all model inversion attack types.
- Unclear Threat Model: The paper lacks clarity in specifying the adversary's knowledge of critical elements, which affects the practicality of the defense.
- Performance Not Satisfying: The defense's performance is questionable, as it doesn't significantly outperform existing methods.
Originality: This work proposes an innovative approach to defending against MI attacks, which opts for transfer learning over traditional dependency-based regularization methods. The idea is not only novel but intuitively appealing, offering a fresh perspective on the problem. What further strengthens the argument is the incorporation of Fisher information, which provides a robust justification for the proposed method. This logical integration of Fisher information bolsters the paper's overall c
Potentially misleading claim: "In other words, MID and BiDO reduce MI attack accuracy by suppressing likelihood P(y|x)." It would be beneficial to clarify whether there is experimental evidence supporting the claim that MID and BiDO reduce MI attack accuracy by suppressing the likelihood P(y|x). If such experiments have been conducted, providing references or details about them would strengthen the paper's credibility. If not, it may be advisable to rephrase this statement as a hypothesis or a po
The work is structured very well, has numerous figures, and has source code attached. The evaluation is quite extensive, showing the advantages of the proposed method.
I do, however, have a number of concerns. Firstly, I do not see anything scientifically new proposed here. It was shown before that A) certain layers can ‘memorise’ more useful information to aid an MI attacker [1] and B) that reducing the number of layers in the model can reduce the amount of useful information learnt and hence limit the attacker’s capabilities [2] (which does not universally hold either, but more on that later). It is also important to note that this work claims all of its results a
Taxonomy
Topics: Oil and Gas Production Techniques · Machine Learning and Algorithms · Speech Recognition and Synthesis
