Feature Noise Induces Loss Discrepancy Across Groups
Fereshte Khani, Percy Liang

TL;DR
This paper shows that feature noise alone can cause loss discrepancy across groups, even when no group suffers an information deficiency. The size of the discrepancy depends on the amount of noise and on how the groups' distributions differ.
Contribution
It identifies feature noise as a subtle cause of loss discrepancy, characterizes the effect theoretically for linear regression, and validates it empirically across datasets.
Findings
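Adding the same amount of feature noise to all individuals induces loss discrepancy even when both groups have infinite data.
Loss discrepancy does not vanish immediately when a distribution shift gives the groups similar moments.
For linear regression, the effect depends on the amount of noise, the difference between the groups' moments, and whether group information is used.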
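The first and third findings can be illustrated with a minimal population-level sketch. It assumes a single scalar feature, two Gaussian groups with equal variance, a noiseless linear target, and a least-squares fit that does not use group membership. The function `group_losses` and all parameter names and values are hypothetical and are not taken from the paper. Per-group mean residual and mean squared error serve here as illustrative measures of discrepancy and may differ from the paper's definitions.

```python
# Population-level (infinite-data) sketch: least squares on a noisy scalar feature.
# Every name and number here is an illustrative assumption, not the paper's setup.

def group_losses(mu0, mu1, p0=0.5, s2=1.0, beta=1.0, tau2=0.0):
    """Population least-squares fit of y on x = z + u, evaluated per group.

    z | group g ~ N(mu_g, s2), y = beta * z, u ~ N(0, tau2) independent of z.
    Returns (slope, [mean residual per group], [mean squared error per group]).
    """
    p1 = 1.0 - p0
    mu = p0 * mu0 + p1 * mu1  # overall mean of z (and of x)
    # Law of total variance: within-group plus between-group variance of z.
    var_z = s2 + p0 * (mu0 - mu) ** 2 + p1 * (mu1 - mu) ** 2
    # Noise in x attenuates the slope: Cov(x, y) / Var(x).
    slope = beta * var_z / (var_z + tau2)
    gap = beta - slope  # zero when there is no feature noise
    # Residual of one individual: gap * (z - mu) - slope * u.
    residual = [gap * (m - mu) for m in (mu0, mu1)]
    mse = [gap ** 2 * (s2 + (m - mu) ** 2) + slope ** 2 * tau2
           for m in (mu0, mu1)]
    return slope, residual, mse


if __name__ == "__main__":
    for tau2 in (0.0, 1.0):
        slope, residual, mse = group_losses(mu0=0.0, mu1=2.0, p0=0.8, tau2=tau2)
        print(f"tau2={tau2}: slope={slope:.3f}, residual={residual}, mse={mse}")
```

Without noise the slope equals `beta` and both groups have zero residual. With noise the slope shrinks, predictions are pulled toward the overall mean, and the groups' mean residuals take opposite signs. In this sketch the squared errors also differ whenever the group means sit at different distances from the overall mean.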
Abstract
The performance of standard learning procedures has been observed to differ widely across groups. Recent studies usually attribute this loss discrepancy to an information deficiency for one group (e.g., one group has less data). In this work, we point to a more subtle source of loss discrepancy: feature noise. Our main result is that even when there is no information deficiency specific to one group (e.g., both groups have infinite data), adding the same amount of feature noise to all individuals leads to loss discrepancy. For linear regression, we thoroughly characterize the effect of feature noise on loss discrepancy in terms of the amount of noise, the difference between moments of the two groups, and whether group information is used or not. We then show this loss discrepancy does not vanish immediately if a shift in distribution causes the groups to have similar moments. On three…
Taxonomy
Topics: Machine Learning and Data Classification · Explainable Artificial Intelligence (XAI) · Statistical Methods and Inference
