FERRET: Private Deep Learning Faster And Better Than DPSGD

David Zagardo

arXiv:2506.05416·cs.CR·June 9, 2025

TL;DR

FERRET is a privacy-preserving deep learning method based on 1-bit gradient compression with Bernoulli masking. It gives formal privacy guarantees without additive noise and, in the reported experiments, trains faster than DPSGD while reaching better utility.

Contribution

FERRET analyzes 1-bit gradient compression with Bernoulli masking under mutual-information differential privacy (MI-DP), releasing at most one sign bit per parameter group. The result is faster training with formal privacy guarantees and better utility than DPSGD.
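
A minimal sketch of the release step described above: each parameter group fires with probability p and, if it fires, releases a single sign bit; otherwise it releases nothing (encoded here as 0). The function name `masked_sign_release`, the choice of the summed group gradient to decide the sign, and the toy inputs are assumptions for illustration, not details taken from the paper.

```python
import random


def masked_sign_release(group_grads, p, rng):
    """Return one value per parameter group: +1 or -1 if the group fires, else 0.

    group_grads: list of lists of floats, one inner list per parameter group.
    p:           Bernoulli firing probability for each group.
    rng:         a random.Random instance, passed in for reproducibility.

    Assumption: the released bit is the sign of the group's summed gradient.
    The paper may aggregate the group differently.
    """
    released = []
    for grads in group_grads:
        if rng.random() < p:  # Bernoulli mask: the group fires
            released.append(1 if sum(grads) >= 0 else -1)
        else:                 # masked out: nothing is released this step
            released.append(0)
    return released


def apply_update(params, released, lr):
    """signSGD-style step: move every parameter in a fired group by lr."""
    return [
        [w - lr * bit for w in group]
        for group, bit in zip(params, released)
    ]


# Hypothetical toy example with two groups.
rng = random.Random(0)
bits = masked_sign_release([[0.5, -0.1], [-2.0, 1.0]], p=0.5, rng=rng)
new_params = apply_update([[1.0, 1.0], [1.0, 1.0]], bits, lr=0.01)
```

A group that does not fire leaves its parameters unchanged for that step, so no more than one bit per group is ever released.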

Findings

01. FERRET-MAX and FERRET-EIGHTH reach lower perplexity than DPSGD across the evaluated models and epoch counts.

02. FERRET achieves formal MI-DP guarantees without additive noise.

03. FERRET trains up to 5 times faster than DPSGD while maintaining privacy and utility.

Abstract

We revisit 1-bit gradient compression through the lens of mutual-information differential privacy (MI-DP). Building on signSGD, we propose FERRET--Fast and Effective Restricted Release for Ethical Training--which transmits at most one sign bit per parameter group with Bernoulli masking. Theory: We prove each fired group leaks at most ln 2 nats; after subsampling with rate s, the total privacy loss of G groups trained for T steps with firing probability p is epsilon = G * T * s * p * ln 2. Thus FERRET achieves MI-DP for epsilon in [0.1, 2] without additive noise. Practice: We evaluate three granularities--FERRET-MAX (finest), FERRET-EIGHTH (medium), and FERRET-2 (coarsest)--on five LLMs (137M-1.8B parameters) against DPSGD and Non-DP baselines. All methods were trained for 1, 3, and 5 epochs. Utility: Across all settings, FERRET-MAX/EIGHTH beat DPSGD's perplexity. At epsilon=0.5, 5…
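
The accounting formula in the abstract, epsilon = G * T * s * p * ln 2, can be evaluated directly, and inverted to find the firing probability that meets a target budget. The sketch below does both. The function names and the example values of G, T, and s are hypothetical and chosen only to show the arithmetic; they are not settings from the paper.

```python
import math


def ferret_epsilon(G, T, s, p):
    """Total privacy loss in nats: epsilon = G * T * s * p * ln 2."""
    return G * T * s * p * math.log(2)


def firing_probability_for(epsilon, G, T, s):
    """Invert the formula: the firing probability p that spends exactly epsilon."""
    p = epsilon / (G * T * s * math.log(2))
    if not 0.0 <= p <= 1.0:
        raise ValueError("target epsilon is not reachable with a probability in [0, 1]")
    return p


# Hypothetical configuration: 2 groups, 1000 steps, subsampling rate 0.01.
p_needed = firing_probability_for(epsilon=0.5, G=2, T=1000, s=0.01)
eps_check = ferret_epsilon(G=2, T=1000, s=0.01, p=p_needed)  # recovers 0.5 up to rounding
```

Because epsilon is linear in each of G, T, s, and p, halving the number of groups or the firing probability halves the privacy loss.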

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Stochastic Gradient Optimization Techniques