Say My Name: a Model's Bias Discovery Framework

Massimiliano Ciranni; Luca Molinaro; Carlo Alberto Barbano; Attilio Fiandrotti; Vittorio Murino; Vito Paolo Pastore; Enzo Tartaglione

arXiv:2408.09570·cs.LG·October 17, 2025

Open Access · 3 Reviews

TL;DR

Say My Name (SaMyNa) is a novel framework that semantically identifies biases in deep learning models, enhancing interpretability and aiding debiasing during training or validation.

Contribution

It introduces the first text-based, semantic bias detection tool for deep models, improving interpretability over existing pseudo-label methods.

Findings

1. Effective in detecting biases on traditional benchmarks.
2. Supports model diagnosis and bias disclaiming.
3. Applicable during training and post-hoc validation.

Abstract

In the last few years, due to the broad applicability of deep learning to downstream tasks and its end-to-end training capabilities, growing concerns have been raised about potential biases toward specific, non-representative patterns. Many works on unsupervised debiasing leverage the tendency of deep models to learn "easier" samples, for example by clustering the latent space to obtain bias pseudo-labels. However, interpreting such pseudo-labels is not trivial, especially for a non-expert end user, as they do not provide semantic information about the bias features. To address this issue, we introduce "Say My Name" (SaMyNa), the first tool to identify biases within deep models semantically. Unlike existing methods, our approach focuses on biases learned by the model. Our text-based pipeline enhances explainability and supports debiasing efforts:…

Peer Reviews

Decision·Submitted to ICLR 2025

Reviewer 01 · Rating 6 · Confidence 3

Strengths

- The paper tackles an important problem in machine learning, namely bias and spurious correlations, and proposes an effective tool to analyse these biases from a human perspective.

Weaknesses

- Experimental analysis on bias discovery is lackluster. I think correlation analyses between the proposed method and human annotations are needed.
- The efficacy of the method could depend heavily on the model type and alignment of the MLLM or text encoder. I believe there should be an experimental analysis to show the robustness of the method on this matter.

Reviewer 02 · Rating 5 · Confidence 4

Strengths

This paper addresses the critical issue of model bias discovery through an interesting approach that utilizes natural language keyword descriptions.

Weaknesses

The proposed method lacks novelty since it shares many components with existing literature. For example, the iteration selection based on misclassification confidence outlined in Section 3.1 is a variant of the approach described by Nahon et al. (2023), while the keyword extraction from natural language captions is similar to that found in Kim et al. (2024). Although these references are cited, the paper does not clearly delineate which aspects are novel, making it challenging to assess its orig

Reviewer 03 · Rating 5 · Confidence 5

Strengths

This paper is well written and addresses the critical research question of identifying unknown dataset bias (spurious correlation) during training. This would enhance the explainability and reliability of models in real-world applications, especially for safety-critical purposes.

Weaknesses

**1. Lack of novelty and effectiveness**: Several key ideas of this paper already exist in a previous paper [1]. These include 1) sampling keywords using a pretrained captioning model, and 2) identifying bias keywords. Despite subtle technical differences, e.g., detecting bias keywords by contrasting true and false positives (this work) versus true positives and false negatives (Kim et al. [1]), overall this paper does not provide any scientific novelty for the same goal beyond the existing pap
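
The contrast this review refers to, comparing caption keywords from true positives with those from false positives, can be illustrated with a minimal sketch. It is not the paper's pipeline: the function name `keyword_contrast`, the stopword list, the scoring rule (difference in document frequency), and the toy captions are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical stopword list; a real pipeline would likely use a fuller one.
STOPWORDS = {"a", "an", "the", "in", "on", "of", "near"}


def keyword_contrast(true_pos_captions, false_pos_captions, min_count=2):
    """Rank keywords by how much more often they occur in false-positive
    captions than in true-positive captions (assumed scoring rule)."""

    def doc_freq(captions):
        # Count each keyword at most once per caption.
        counts = Counter()
        for caption in captions:
            words = set(caption.lower().split()) - STOPWORDS
            counts.update(words)
        return counts

    tp = doc_freq(true_pos_captions)
    fp = doc_freq(false_pos_captions)
    n_tp = max(len(true_pos_captions), 1)
    n_fp = max(len(false_pos_captions), 1)

    scores = {}
    for word, count in fp.items():
        if count < min_count:
            continue  # ignore keywords that are too rare to be informative
        scores[word] = count / n_fp - tp.get(word, 0) / n_tp

    # Highest contrast first; ties broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy captions, invented purely to exercise the function.
tp_captions = [
    "a bird standing on water",
    "a duck swimming on water",
    "a bird near a lake",
]
fp_captions = [
    "a bird perched in a forest",
    "a small bird in a forest",
    "a bird on a branch in a forest",
]

ranked = keyword_contrast(tp_captions, fp_captions)
print(ranked[0][0])  # keyword most over-represented among false positives
```

On the toy captions, the top-ranked keyword is the one that appears in every false-positive caption and in no true-positive caption. Swapping the second argument for false-negative captions would give the variant the review attributes to Kim et al. [1].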

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI · Topic Modeling