Private Rate-Constrained Optimization with Applications to Fair Learning

Mohammad Yaghini; Tudor Cebere; Michael Menart; Aurélien Bellet; Nicolas Papernot

arXiv:2505.22703·cs.LG·May 30, 2025

TL;DR

This paper introduces RaCO-DP, a differentially private algorithm for rate-constrained optimization, with group-fair learning as its main application. It handles the inter-sample dependencies that rate constraints introduce and reports better fairness-utility trade-offs than existing approaches.

Contribution

We develop RaCO-DP, a novel DP-SGDA algorithm for rate-constrained problems, with convergence analysis and empirical validation on fairness tasks.

Findings

01. RaCO-DP effectively handles rate constraints under differential privacy.

02. The method achieves better fairness-utility trade-offs than existing approaches.

03. In the empirical results, RaCO-DP Pareto-dominates existing approaches in fairness and utility.

Abstract

Many problems in trustworthy ML can be formulated as minimization of the model error under constraints on the prediction rates of the model for suitably-chosen marginals, including most group fairness constraints (demographic parity, equality of odds, etc.). In this work, we study such constrained minimization problems under differential privacy (DP). Standard DP optimization techniques like DP-SGD rely on the loss function's decomposability into per-sample contributions. However, rate constraints introduce inter-sample dependencies, violating the decomposability requirement. To address this, we develop RaCO-DP, a DP variant of the Stochastic Gradient Descent-Ascent (SGDA) algorithm which solves the Lagrangian formulation of rate constraint problems. We demonstrate that the additional privacy cost of incorporating these constraints reduces to privately estimating a histogram over the…
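The idea that the constraint's privacy cost comes down to a private histogram can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes hard (argmax) predictions, Laplace noise calibrated under add/remove adjacency, and demographic parity as the rate constraint. The names `noisy_histogram`, `parity_gap`, and `dual_ascent_step` are hypothetical. The private primal (model) update that a full SGDA loop would need is omitted.

```python
import random


def noisy_histogram(groups, preds, n_groups, n_classes, epsilon, rng):
    """Count (group, predicted class) pairs, then add Laplace noise per cell."""
    counts = [[0.0] * n_classes for _ in range(n_groups)]
    for g, y in zip(groups, preds):
        counts[g][y] += 1.0
    # Assumption: add/remove adjacency, so one sample changes one cell by 1
    # (L1 sensitivity 1) and Laplace noise of scale 1/epsilon is used.
    rate = epsilon  # expovariate takes the rate, i.e. 1/scale
    for row in counts:
        for y in range(n_classes):
            # The difference of two i.i.d. exponentials is Laplace-distributed.
            row[y] += rng.expovariate(rate) - rng.expovariate(rate)
    return counts


def positive_rates(hist, positive_class=1):
    """Per-group rate of the positive prediction, from noisy counts only."""
    rates = []
    for row in hist:
        total = max(sum(row), 1.0)  # noisy totals can be tiny or negative
        rates.append(min(max(row[positive_class] / total, 0.0), 1.0))
    return rates


def parity_gap(hist):
    """Demographic-parity violation: spread of positive rates across groups."""
    rates = positive_rates(hist)
    return max(rates) - min(rates)


def dual_ascent_step(lam, gap, gamma, step_size):
    """Projected ascent on the multiplier for the constraint gap <= gamma."""
    return max(0.0, lam + step_size * (gap - gamma))


rng = random.Random(0)
groups = [0] * 100 + [1] * 100
preds = [1] * 60 + [0] * 40 + [1] * 30 + [0] * 70
hist = noisy_histogram(groups, preds, n_groups=2, n_classes=2, epsilon=1.0, rng=rng)
gap = parity_gap(hist)
lam = dual_ascent_step(lam=0.0, gap=gap, gamma=0.05, step_size=0.1)
```

Everything computed after `noisy_histogram` reads only the noisy counts, so in this sketch the rate estimates and the multiplier update are post-processing and consume no additional privacy budget.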

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 8 · Confidence 3

Strengths

This seems like a very solid contribution. There's been a lot of work on DP optimization in both minmax and fairness settings, and this seems like it contains technical ideas that will be useful elsewhere. The algorithm will surely be an experimental baseline for future work.

Weaknesses

I feel the presentation of the paper could be improved. Here are some areas where I had particular difficulty:
1. The rate constraints here adapt the Cotter et al. (2019b) definition by extending it to multiclass, but also working with soft decisions. The latter is a major point in their paper and may be an important distinction in the context of fairness [see 1].
2. We also consider constraints that may look at overlapping parts of the dataset. I am still somewhat confused about this: am I given the

Reviewer 02 · Rating 6 · Confidence 3

Strengths

* The paper is clearly written and easy to read.
* It is significant because it generalizes prior fairness constraints, which were previously limited to the binary classification setting, to the multiclass setting. It also proposes a novel DP algorithm that leverages a private histogram; empirically, the method achieves a better privacy–fairness trade-off, and the paper also provides a theoretical convergence analysis for the algorithm.
* It provides experimental investigations and ablations over different

Weaknesses

The method’s performance can degrade substantially when the smallest subgroup is tiny, because DP noise makes its histogram estimates inaccurate. The paper offers only limited discussion of this case.
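The small-subgroup concern can be made concrete with a back-of-the-envelope sketch. It assumes Laplace noise of scale 1/epsilon on each count (the paper's actual noise mechanism and calibration may differ), and `relative_count_noise` is a hypothetical helper. For a fixed privacy budget, the noise standard deviation relative to the true count grows as the subgroup shrinks.

```python
import math


def relative_count_noise(group_size, epsilon):
    """Std of Laplace(1/epsilon) noise divided by the true subgroup count."""
    laplace_std = math.sqrt(2.0) / epsilon  # std of a Laplace with scale 1/epsilon
    return laplace_std / group_size


# Compare subgroups of different sizes under the same (assumed) budget.
for n in (10, 100, 1000):
    print(n, round(relative_count_noise(n, epsilon=1.0), 4))
```

When counts are taken per mini-batch, the relevant `group_size` is the subgroup's count within the batch, which is smaller still.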

Reviewer 03 · Rating 6 · Confidence 2

Strengths

- Formal convergence analysis of RaCO-DP.
- New SOTA claimed on standard benchmarks: CelebA, Parkinsons, ACSEmployment.
- Scales to multiple sensitive groups (tested up to 18).
- Scales beyond convex models to a deep learning model (ResNet16 on CelebA).
- Stronger privacy guarantees than prior approaches.
- Allows the maximum disparity to be specified directly.

Weaknesses

- The results section is unclear: exactly which datasets show results achieving SOTA?
- The authors did not run the SOTA experiments themselves; they compared against published numbers from Lowy et al.
