PLS-based approach for fair representation learning

Elena M. De-Diego; Adrián Perez-Suay; Paula Gordaliza; Jean-Michel Loubes

arXiv:2502.16263·cs.LG·February 25, 2025

Open Access · 3 Reviews

TL;DR

This paper introduces a novel Fair PLS method that incorporates fairness constraints into representation learning, applicable in linear and nonlinear cases, demonstrating improved performance over fair PCA across various datasets.

Contribution

The paper proposes a new Fair PLS approach that integrates fairness constraints into PLS components, extending to nonlinear cases with kernel embeddings.

Findings

01 · Outperforms standard fair PCA in experiments

02 · Effective in both linear and nonlinear settings

03 · Applicable to multiple datasets

Abstract

We revisit the problem of fair representation learning by proposing Fair Partial Least Squares (PLS) components. PLS is widely used in statistics to efficiently reduce the dimension of the data by providing a representation tailored for prediction. We propose a novel method to incorporate fairness constraints in the construction of PLS components. This new algorithm provides a feasible way to construct such features both in the linear case and, using kernel embeddings, in the nonlinear case. The efficiency of our method is evaluated on different datasets, and we prove its superiority with respect to the standard fair PCA method.
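To make the idea in the abstract concrete, here is a minimal sketch of one way a fairness penalty could enter the first PLS direction. It is not the authors' algorithm: the penalised covariance objective, the closed-form eigenvector solution, and the names `fair_pls_direction`, `eta`, and `abs_cov` are all assumptions chosen for illustration. Standard PLS picks a unit vector `w` maximising the squared covariance between `Xw` and the target `y`; the sketch subtracts `eta` times the squared covariance between `Xw` and a sensitive attribute `s`.

```python
import numpy as np

def fair_pls_direction(X, y, s, eta=1.0):
    """Illustrative first direction of a fairness-penalised PLS (assumed objective).

    Maximises cov(Xw, y)^2 - eta * cov(Xw, s)^2 subject to ||w|| = 1.
    """
    # Centre the data so that averaged inner products are covariances.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    sc = s - s.mean()
    n = X.shape[0]
    a = Xc.T @ yc / n  # covariance of each feature with the target
    b = Xc.T @ sc / n  # covariance of each feature with the sensitive attribute
    # The objective equals w^T (a a^T - eta * b b^T) w, a symmetric quadratic form.
    M = np.outer(a, a) - eta * np.outer(b, b)
    eigvals, eigvecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    return eigvecs[:, -1]                 # unit eigenvector of the largest eigenvalue

def abs_cov(u, v):
    """Absolute empirical covariance between two 1-D arrays."""
    return abs(np.mean((u - u.mean()) * (v - v.mean())))

# Synthetic data (hypothetical): feature 0 leaks the binary sensitive attribute s.
rng = np.random.default_rng(0)
n = 2000
s = rng.integers(0, 2, size=n).astype(float)
X = rng.normal(size=(n, 5))
X[:, 0] += 2.0 * s
y = X[:, 0] + X[:, 1] + 0.1 * rng.normal(size=n)

w_plain = fair_pls_direction(X, y, s, eta=0.0)   # ordinary first PLS direction
w_fair = fair_pls_direction(X, y, s, eta=50.0)   # strongly penalised direction
cov_s_plain = abs_cov(X @ w_plain, s)
cov_s_fair = abs_cov(X @ w_fair, s)
```

With `eta=0` the direction is proportional to `X^T y`, the usual first PLS weight vector. Raising `eta` moves weight away from the feature that correlates with `s`, so `cov_s_fair` comes out smaller than `cov_s_plain` on this toy data, at the cost of a lower covariance with `y`. Sweeping `eta` traces the kind of accuracy-fairness trade-off the reviewers refer to.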

Peer Reviews

Decision·Submitted to ICLR 2025

Reviewer 01 · Rating 5 · Confidence 4

Strengths

* The paper is well-written with a clear introduction.
* Fair representation is an important problem, particularly in high-dimensional data where many representation learning methods fail.
* The paper's use of a regularization parameter, which seems to effectively explore the accuracy-fairness front, is a big advantage.
* The method is simple and straightforward.
* The inclusion of the Equality of Odds constraint is a valuable addition.
* The experiments are comprehensive and include eval

Weaknesses

* The paper suggests using Gradient Descent for optimization without discussing convexity. Even empirical testing would be valuable - such as showing convergence over epochs to identify patterns reflecting non-convex problems (like bumps).
* Section 4.3 appears disconnected and seemingly unrelated, particularly as it doesn't address all challenges in seq-to-seq fairness problems, only covering the relatively straightforward fact that text can be encoded.
* A main concern is that no comparison to

Reviewer 02 · Rating 5 · Confidence 4

Strengths

1. The paper is well-written and easy to follow, with a clearly presented mathematical part.
2. Interpretations and general thoughts are provided.
3. A possible relation to fairness in LLM is discussed.

Weaknesses

Although I appreciate the presentation of the work, I have the following concerns.

1. The empirical comparison to previous fair PCA is only tested on one dataset (Adult Income) and provided in the Appendix. It is not convincing that the proposed work will outperform previous work in most cases.
2. Although a possible application to fairness in LLMs is discussed, it is rather superficial. Unless you have done experiments on LLMs to measure fairness, I do not suggest this as a separate section in the main pa

Reviewer 03 · Rating 3 · Confidence 4

Strengths

This paper studies an important problem of fair representation learning.

Weaknesses

I have the following questions and concerns regarding the contribution and evaluation of the work:

1. **Motivation for PLS-based Framework:** It would be helpful for the authors to clarify the motivation behind selecting PLS as the foundation for their framework. There are various types of approaches for fair representation learning, such as adversarial learning [1], disentanglement [2], and distribution alignment [3]. What are the specific advantages of a PLS-based approach in comparison to th

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning

Methods: Principal Components Analysis · Network On Network