Learning to Generate Image Embeddings with User-level Differential Privacy
Zheng Xu, Maxwell Collins, Yuxiao Wang, Liviu Panait, Sewoong Oh, Sean Augenstein, Ting Liu, Florian Schroff, H. Brendan McMahan

TL;DR
This paper introduces DP-FedEmb, a federated learning method with differential privacy controls for training large image embedding models across many users, achieving strong privacy guarantees with minimal utility loss.
Contribution
The paper proposes DP-FedEmb, a federated learning variant that provides user-level differential privacy for large image embedding models. It pairs per-user sensitivity control and noise addition with virtual clients, partial aggregation, private local fine-tuning, and public pretraining.
Findings
Superior utility under the same privacy budget on benchmark datasets
Achieves user-level DP with epsilon < 4 while keeping the utility drop within 5%
Effective for face, landmark, and species image embeddings
Abstract
Small on-device models have been successfully trained with user-level differential privacy (DP) for next word prediction and image classification tasks in the past. However, existing methods can fail when directly applied to learn embedding models using supervised training data with a large class space. To achieve user-level DP for large image-to-embedding feature extractors, we propose DP-FedEmb, a variant of federated learning algorithms with per-user sensitivity control and noise addition, to train from user-partitioned data centralized in the datacenter. DP-FedEmb combines virtual clients, partial aggregation, private local fine-tuning, and public pretraining to achieve strong privacy-utility trade-offs. We apply DP-FedEmb to train image embedding models for faces, landmarks, and natural species, and demonstrate its superior utility under the same privacy budget on benchmark datasets…
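The "per-user sensitivity control and noise addition" mentioned in the abstract is commonly realized by clipping each user's model update to a fixed L2 norm and adding Gaussian noise to the aggregate. The following is a minimal sketch of that generic clip-and-noise step, not the authors' implementation: the function names, the parameters clip_norm and noise_multiplier, and the plain-list representation of an update are all assumptions made for illustration. It does not cover virtual clients, partial aggregation, local fine-tuning, or public pretraining.

```python
import math
import random


def clip_update(update, clip_norm):
    """Scale one user's model delta so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def dp_aggregate(user_updates, clip_norm, noise_multiplier, rng):
    """Average clipped per-user updates after adding Gaussian noise.

    Hypothetical helper: the noise standard deviation on the sum is
    noise_multiplier * clip_norm, so no single user can shift the sum
    by more than clip_norm before noise is added.
    """
    num_users = len(user_updates)
    dim = len(user_updates[0])
    total = [0.0] * dim
    for update in user_updates:
        clipped = clip_update(update, clip_norm)
        for i, x in enumerate(clipped):
            total[i] += x
    sigma = noise_multiplier * clip_norm
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [v / num_users for v in noisy]


# Usage: three users, each contributing one 2-dimensional update.
rng = random.Random(0)
updates = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]
noisy_avg = dp_aggregate(updates, clip_norm=1.0, noise_multiplier=1.0, rng=rng)
```

Because each user contributes exactly one clipped vector, the guarantee attaches to the user rather than to individual examples. Translating a given noise_multiplier and number of rounds into a concrete epsilon requires a separate privacy accountant, which this sketch omits.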
Taxonomy
Topics: Privacy-Preserving Technologies in Data
