Fair Streaming Principal Component Analysis: Statistical and Algorithmic Viewpoint

Junghyun Lee; Hanseul Cho; Se-Young Yun; Chulhee Yun

arXiv:2310.18593 · stat.ML · October 31, 2023 · 1 citation


TL;DR

This paper introduces a statistically grounded framework for fair PCA, proposes a memory-efficient streaming algorithm called FNPM with statistical guarantees, and evaluates its efficacy and fairness on real datasets.

Contribution

It formulates fair PCA within a new learnability framework and develops the first memory-efficient streaming algorithm with statistical guarantees.

Findings

01

The proposed FNPM algorithm is memory-efficient and suitable for streaming data.

02

Theoretical guarantees show probably approximately fair and optimal (PAFO) learnability for fair PCA.

03

Empirical results confirm the algorithm's efficacy and fairness on real datasets.
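To make the memory-efficiency claim concrete, here is a minimal sketch of a generic streaming block power iteration for PCA. It is not the paper's FNPM algorithm and includes no fairness constraint; the function name `streaming_block_power_method` and its parameters are hypothetical, chosen only for illustration. The point is that each mini-batch is used once and discarded, so only a d-by-k matrix is kept in memory instead of the full dataset or a d-by-d covariance matrix.

```python
import numpy as np

def streaming_block_power_method(batches, d, k, seed=0):
    """Generic streaming PCA sketch (hypothetical helper, not the paper's FNPM).

    batches: iterable of arrays of shape (b, d), each seen exactly once.
    Returns a (d, k) matrix with orthonormal columns.
    """
    rng = np.random.default_rng(seed)
    # Random orthonormal starting point.
    Q, _ = np.linalg.qr(rng.standard_normal((d, k)))
    for X in batches:
        # Apply the batch's empirical second-moment matrix to Q without
        # ever forming the d-by-d matrix: X.T @ (X @ Q) costs O(b * d * k).
        S = X.T @ (X @ Q) / X.shape[0]
        # Re-orthonormalize, as in the classical power method.
        Q, _ = np.linalg.qr(S)
    return Q
```

The working memory here is one batch plus two d-by-k matrices, which is the property that makes this family of methods attractive when the data cannot be stored.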

Abstract

Fair Principal Component Analysis (PCA) is a problem setting where we aim to perform PCA while making the resulting representation fair in that the projected distributions, conditional on the sensitive attributes, match one another. However, existing approaches to fair PCA have two main problems: theoretically, there has been no statistical foundation of fair PCA in terms of learnability; practically, limited memory prevents us from using existing approaches, as they explicitly rely on full access to the entire data. On the theoretical side, we rigorously formulate fair PCA using a new notion called probably approximately fair and optimal (PAFO) learnability. On the practical side, motivated by recent advances in streaming algorithms for addressing memory limitation, we propose a new setting called fair streaming PCA along with a memory-efficient algorithm, fair noisy…
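As a rough illustration of what "projected distributions match" can mean, the sketch below enforces only the simplest version of that condition: equal group means after projection. This is a simplifying assumption made here, not the paper's definition or algorithm, and it uses the full dataset rather than a stream. The helper names `mean_matched_pca` and `projected_mean_gap` are hypothetical. The idea is to remove the direction of the group mean difference before running ordinary PCA.

```python
import numpy as np

def mean_matched_pca(X, a, k):
    """PCA restricted to directions orthogonal to the group mean gap.

    X: (n, d) data, a: (n,) binary sensitive attribute (0 or 1),
    k: target dimension, assumed to satisfy k <= d - 1.
    Hypothetical helper; matches first moments only.
    """
    # Unit vector along the difference of the two group means.
    f = X[a == 1].mean(axis=0) - X[a == 0].mean(axis=0)
    f = f / np.linalg.norm(f)
    # Orthogonal projector onto the complement of f.
    P = np.eye(X.shape[1]) - np.outer(f, f)
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / X.shape[0]
    # eigh returns eigenvalues in ascending order, so take the last k columns.
    _, evecs = np.linalg.eigh(P @ cov @ P)
    return evecs[:, -k:]

def projected_mean_gap(X, a, V):
    """Euclidean distance between the two group means after projecting onto V."""
    Z = X @ V
    return np.linalg.norm(Z[a == 1].mean(axis=0) - Z[a == 0].mean(axis=0))
```

A gap near zero from `projected_mean_gap` indicates that the two groups are indistinguishable by their means in the reduced representation; higher-order differences between the conditional distributions are not addressed by this sketch.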


Code & Models

Repositories

hanseuljo/fair-streaming-pca
jax · Official

Videos

Fair Streaming Principal Component Analysis: Statistical and Algorithmic Viewpoint · SlidesLive

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Face and Expression Recognition · Blind Source Separation Techniques

Methods: Principal Components Analysis