Private Speech Classification without Collapse: Stabilized DP Training and Offline Distillation

Yadi Wen, Tianxin Li, Enji Liang, Rong Du, and Yue Fu

arXiv:2605.02718 · cs.SD · May 5, 2026

TL;DR

This paper introduces a two-stage method for privacy-preserving speech classification. A multimodal teacher is first trained under differential privacy (DP) with stabilized training, and is then distilled offline into an audio-only student. The method targets the collapse to a near single-class predictor that DP training can cause.

Contribution

The paper proposes a two-stage protocol that combines stabilized DP training with offline distillation, so that the released audio-only speech classifier keeps its privacy guarantees without collapsing to a single class.

Findings

01

Under strong privacy ($\epsilon \leq 1$), DP-SGD training on imbalanced tasks can collapse to a near single-class predictor, a failure that overall accuracy can obscure.

02

The proposed method stabilizes training and maintains privacy guarantees.

03

Offline distillation improves audio-only model performance and robustness.
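
As background for the DP training these findings refer to, the following is a minimal sketch of one generic DP-SGD update (per-example gradient clipping followed by Gaussian noise). It is not the paper's stabilized variant, whose details are not given on this page. The function names `clip_gradient` and `dp_sgd_step` and the parameters `max_norm` and `noise_multiplier` are illustrative assumptions, and privacy accounting for $\epsilon$ is deliberately left out.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Scale one example's gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(params, per_example_grads, lr, max_norm, noise_multiplier, rng):
    """One generic DP-SGD update (hypothetical helper, not the paper's code).

    1. Clip each per-example gradient to L2 norm <= max_norm.
    2. Sum the clipped gradients.
    3. Add Gaussian noise with std = noise_multiplier * max_norm per coordinate.
    4. Average over the batch and take a gradient step.
    """
    clipped = [clip_gradient(g, max_norm) for g in per_example_grads]
    summed = [sum(column) for column in zip(*clipped)]
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    batch_size = len(per_example_grads)
    return [p - lr * n / batch_size for p, n in zip(params, noisy)]


# Toy usage: two examples, two parameters. Values are made up for illustration.
rng = random.Random(0)
toy_grads = [[3.0, 4.0], [0.3, 0.4]]
new_params = dp_sgd_step([0.0, 0.0], toy_grads, lr=1.0, max_norm=1.0,
                         noise_multiplier=1.1, rng=rng)
```

Because clipping bounds every example's contribution equally and the noise scale does not depend on class frequency, the few minority-class gradients in an imbalanced batch are easy to swamp. That is one plausible reading of why strong privacy can push a model toward the majority class, offered here as intuition rather than as the paper's analysis.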

Abstract

We study example-level private supervised speech classification under a practical release constraint: training may access privileged side information, but the released model must be audio-only. This setting is important because speech systems can often exploit richer side information during development, whereas deployment and release require a lightweight unimodal model with auditable privacy guarantees. Using DP-SGD on the private dataset $D_{\mathrm{priv}}$, we identify a strong-privacy failure mode ($\epsilon \leq 1$) on imbalanced tasks, where training may collapse to a near single-class predictor, a phenomenon that overall accuracy can obscure. We therefore emphasize Macro-F1, balanced accuracy, and a simple collapse diagnostic. This failure is especially problematic in our release setting because a collapsed private teacher cannot provide useful supervision for the downstream…
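
To illustrate why overall accuracy can hide a collapsed classifier, here is a small sketch that compares accuracy with Macro-F1 and balanced accuracy on an imbalanced toy task. The abstract does not define its collapse diagnostic, so `is_collapsed` and its `threshold` of 0.95 are assumptions made only for this example: the check flags a model whose predictions fall almost entirely into one class.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, labels):
    """Return {label: (tp, fp, fn)} computed one-vs-rest for each label."""
    counts = {}
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        counts[c] = (tp, fp, fn)
    return counts


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1; a class with no hits scores 0."""
    scores = []
    for tp, fp, fn in confusion_counts(y_true, y_pred, labels).values():
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def balanced_accuracy(y_true, y_pred, labels):
    """Unweighted mean of per-class recall over classes that occur in y_true."""
    recalls = []
    for tp, _fp, fn in confusion_counts(y_true, y_pred, labels).values():
        support = tp + fn
        if support:
            recalls.append(tp / support)
    return sum(recalls) / len(recalls)


def is_collapsed(y_pred, threshold=0.95):
    """Hypothetical diagnostic: True if one class takes >= threshold of predictions."""
    top_share = max(Counter(y_pred).values()) / len(y_pred)
    return top_share >= threshold


# Toy imbalanced task: 90 majority examples, 10 minority examples,
# and a collapsed model that always predicts the majority class.
y_true = [0] * 90 + [1] * 10
y_pred = [0] * 100
labels = [0, 1]

acc = accuracy(y_true, y_pred)                      # 0.9
mf1 = macro_f1(y_true, y_pred, labels)              # about 0.474
bacc = balanced_accuracy(y_true, y_pred, labels)    # 0.5
collapsed = is_collapsed(y_pred)                    # True
```

In this toy case the always-majority predictor reaches 90% accuracy, while Macro-F1 falls to about 0.47 and balanced accuracy to 0.5, which is the kind of gap the abstract's choice of metrics is meant to expose.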
