TL;DR
This paper introduces RaCO-DP, a differentially private algorithm for constrained optimization in fair machine learning, addressing inter-sample dependencies and demonstrating superior fairness-utility trade-offs.
Contribution
We develop RaCO-DP, a novel DP-SGDA algorithm for rate-constrained problems, with convergence analysis and empirical validation on fairness tasks.
Findings
RaCO-DP effectively handles rate constraints under differential privacy.
The method achieves better fairness-utility trade-offs than existing approaches.
Empirical results demonstrate Pareto dominance in fairness and utility.
Abstract
Many problems in trustworthy ML can be formulated as minimization of the model error under constraints on the prediction rates of the model for suitably-chosen marginals, including most group fairness constraints (demographic parity, equality of odds, etc.). In this work, we study such constrained minimization problems under differential privacy (DP). Standard DP optimization techniques like DP-SGD rely on the loss function's decomposability into per-sample contributions. However, rate constraints introduce inter-sample dependencies, violating the decomposability requirement. To address this, we develop RaCO-DP, a DP variant of the Stochastic Gradient Descent-Ascent (SGDA) algorithm which solves the Lagrangian formulation of rate constraint problems. We demonstrate that the additional privacy cost of incorporating these constraints reduces to privately estimating a histogram over the…
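The idea in the abstract, that rate constraints can be evaluated from a privately estimated histogram, can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the choice of (group, predicted class) histogram cells, the Gaussian noise, the value of `sigma`, the tolerance `0.05`, and all function names are assumptions made for the example. The dual update shown is the generic projected gradient ascent step on a Lagrange multiplier.

```python
import random

def noisy_histogram(groups, preds, num_groups, num_classes, sigma, rng):
    """Count (group, predicted class) pairs, then add Gaussian noise to each cell.
    Assumption: Gaussian noise with std `sigma`; calibration to a privacy budget is omitted."""
    counts = [[0.0] * num_classes for _ in range(num_groups)]
    for g, y in zip(groups, preds):
        counts[g][y] += 1.0
    return [[c + rng.gauss(0.0, sigma) for c in row] for row in counts]

def rates_from_histogram(hist):
    """Convert each group's (noisy) counts into prediction rates.
    This only post-processes the histogram, so it touches no raw data."""
    rates = []
    for row in hist:
        clipped = [max(c, 0.0) for c in row]  # noisy counts can be negative
        total = sum(clipped)
        rates.append([c / total if total > 0 else 0.0 for c in clipped])
    return rates

def demographic_parity_gap(rates, cls=1):
    """Largest difference between groups in the rate of predicting class `cls`."""
    col = [r[cls] for r in rates]
    return max(col) - min(col)

def dual_ascent_step(lam, violation, lr):
    """Projected gradient ascent on a Lagrange multiplier (kept non-negative)."""
    return max(0.0, lam + lr * violation)

# Toy data: group 0 is predicted positive 60% of the time, group 1 40%.
rng = random.Random(0)
groups = [0] * 100 + [1] * 100
preds = [1] * 60 + [0] * 40 + [1] * 40 + [0] * 60

# With sigma=0.0 the histogram is exact; a positive sigma gives a noisy estimate.
exact = rates_from_histogram(noisy_histogram(groups, preds, 2, 2, 0.0, rng))
gap = demographic_parity_gap(exact)

# Hypothetical tolerance of 0.05 on the gap; the multiplier grows while it is exceeded.
lam = dual_ascent_step(0.0, gap - 0.05, lr=1.0)
```

Because the rates are computed only from the histogram, every constraint evaluated this way depends on the data through the noisy counts alone, which is why the example treats rate estimation as post-processing.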
Peer Reviews
Decision·ICLR 2026 Poster
This seems like a very solid contribution. There's been a lot of work on DP optimization in both minmax and fairness settings, and this seems like it contains technical ideas that will be useful elsewhere. The algorithm will surely be an experimental baseline for future work.
I feel the presentation of the paper could be improved. Here are some areas where I had particular difficulty:
1. The rate constraints here adapt the Cotter et al. (2019b) definition by extending it to multiclass, but also working with soft decisions. The latter is a major point in their paper and may be an important distinction in the context of fairness [see 1].
2. We also consider constraints that may look at overlapping parts of the dataset. I am still somewhat confused about this: am I given the
* The paper is clearly written and easy to read.
* It has significance as it generalizes prior fairness constraints that were previously limited to the binary classification setting to the multiclass setting. It also proposes a novel DP algorithm that leverages a private histogram; empirically, the method achieves a better privacy–fairness trade-off, and the paper also provides a theoretical convergence analysis for the algorithm.
* It provides experimental investigations and ablations over different
The method’s performance can degrade substantially when the smallest subgroup is tiny, because DP noise makes its histogram estimates inaccurate. The paper offers only limited discussion of this case.
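The reviewer's point about tiny subgroups can be illustrated with a short simulation. This is a sketch under stated assumptions, not a reproduction of the paper's mechanism: it assumes zero-mean Gaussian noise with a fixed standard deviation `sigma` is added to a subgroup count, and the counts 20 and 2000 and `sigma=5.0` are arbitrary values chosen for the example.

```python
import random

def mean_relative_error(true_count, sigma, trials=1000, seed=0):
    """Average |noisy - true| / true when Gaussian noise (std `sigma`) is added to a count."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        noisy = true_count + rng.gauss(0.0, sigma)
        total += abs(noisy - true_count) / true_count
    return total / trials

# Same noise scale, very different subgroup sizes (all values are illustrative).
err_small = mean_relative_error(true_count=20, sigma=5.0)
err_large = mean_relative_error(true_count=2000, sigma=5.0)
```

The noise scale is the same in both cases, so the relative error shrinks in proportion to the subgroup size: a count of 20 is distorted roughly a hundred times more, in relative terms, than a count of 2000.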
- Formal convergence analysis of RaCO-DP.
- New SOTA claimed on standard benchmarks for tabular datasets: CelebA, Parkinsons, ACSEmployment.
- Scales to multiple sensitive groups (tested up to 18).
- Scales beyond convex models to deep learning models (ResNet16 on CelebA).
- Stronger privacy guarantees than previous approaches.
- Allows the maximum disparity to be specified directly.
- The results section is unclear: exactly which datasets show results achieving SOTA?
- The authors did not run the SOTA baseline experiments themselves but compared against published data from Lowy et al.
