Mitigating Membership Inference in Intermediate Representations with Differentially Private Training

Jiayang Meng; Tao Huang; Chen Hou; Guolong Zheng; Hong Chen

arXiv:2602.22611 · cs.LG · May 12, 2026

TL;DR

This paper proposes LM-DP-SGD, a layer-wise adaptive differentially private training method that reduces membership inference risks in intermediate representations while maintaining model utility.

Contribution

The paper introduces a layer-wise, risk-aware variant of DP-SGD that allocates privacy protection according to per-layer MIA vulnerability estimates, improving the privacy-utility trade-off.

Findings

1. LM-DP-SGD reduces IR-level MIA risk under the same privacy budget.

2. The method preserves model utility better than uniform DP-SGD.

3. Theoretical guarantees are established for privacy and convergence.

Abstract

In Embedding-as-an-Interface (EaaI) settings, pre-trained models are queried for Intermediate Representations (IRs). The distributional properties of IRs can leak training-set membership signals, enabling Membership Inference Attacks (MIAs) whose strength varies across layers. Although Differentially Private Stochastic Gradient Descent (DP-SGD) mitigates such leakage, existing implementations employ per-example gradient clipping and a uniform, layer-agnostic noise multiplier, ignoring heterogeneous layer-wise MIA vulnerability. This paper introduces Layer-wise MIA-risk-aware DP-SGD (LM-DP-SGD), which adaptively allocates privacy protection across layers in proportion to their MIA risk. Specifically, LM-DP-SGD trains a shadow model on a public shadow dataset, extracts per-layer IRs from its train/test splits, and fits layer-specific MIA adversaries, using their attack error rates as…
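The abstract is cut off before it states how attack error rates are turned into per-layer protection, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that a layer's risk is one minus its attack error rate, that noise multipliers scale linearly with normalized risk around a base value, and that clipping is applied to each example's full gradient. The function names (`risk_weights`, `layer_noise_multipliers`, `noisy_layerwise_gradient`) are hypothetical, and the sketch does no privacy accounting, so it makes no claim about the resulting (ε, δ).

```python
import math
import random


def risk_weights(attack_error_rates):
    """Map per-layer attack error rates to normalized risk weights.

    Assumption (not from the paper): risk = 1 - error rate, so a layer
    whose adversary errs less often is treated as more vulnerable.
    """
    risks = [1.0 - e for e in attack_error_rates]
    total = sum(risks)
    return [r / total for r in risks]


def layer_noise_multipliers(attack_error_rates, base_sigma):
    """Give riskier layers a larger noise multiplier.

    Assumption: sigma_l = base_sigma * L * w_l, so the multipliers
    average to base_sigma across the L layers.
    """
    weights = risk_weights(attack_error_rates)
    return [base_sigma * len(weights) * w for w in weights]


def clip_per_example(grads, clip_norm):
    """Rescale one example's gradient (a list of per-layer lists) so its
    global L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for layer in grads for g in layer))
    scale = min(1.0, clip_norm / (norm + 1e-12))
    return [[g * scale for g in layer] for layer in grads]


def noisy_layerwise_gradient(per_example_grads, clip_norm, sigmas, rng):
    """Clip each example, sum per layer, add Gaussian noise with a
    layer-specific multiplier, then average over the batch."""
    batch = len(per_example_grads)
    clipped = [clip_per_example(g, clip_norm) for g in per_example_grads]
    noisy = []
    for layer_idx, sigma in enumerate(sigmas):
        dim = len(clipped[0][layer_idx])
        summed = [sum(ex[layer_idx][j] for ex in clipped) for j in range(dim)]
        noisy.append(
            [(s + rng.gauss(0.0, sigma * clip_norm)) / batch for s in summed]
        )
    return noisy


if __name__ == "__main__":
    # Hypothetical attack error rates for three layers (illustrative only).
    sigmas = layer_noise_multipliers([0.5, 0.3, 0.1], base_sigma=1.0)
    grads = [[[3.0], [4.0], [0.0]], [[0.3], [0.4], [0.0]]]
    print(sigmas)
    print(noisy_layerwise_gradient(grads, 1.0, sigmas, random.Random(0)))
```

Uniform DP-SGD corresponds to passing the same multiplier for every layer. A real implementation would also need a privacy accountant that handles heterogeneous noise, which this sketch leaves out.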
