Say My Name: a Model's Bias Discovery Framework
Massimiliano Ciranni, Luca Molinaro, Carlo Alberto Barbano, Attilio Fiandrotti, Vittorio Murino, Vito Paolo Pastore, Enzo Tartaglione

TL;DR
Say My Name (SaMyNa) is a novel framework that semantically identifies biases in deep learning models, enhancing interpretability and aiding debiasing during training or validation.
Contribution
It introduces the first text-based, semantic bias detection tool for deep models, improving interpretability over existing pseudo-label methods.
Findings
Effective in detecting biases on traditional benchmarks.
Supports model diagnosis and bias disclaiming.
Applicable during training and post-hoc validation.
Abstract
In the last few years, the broad applicability of deep learning to downstream tasks and its end-to-end training capabilities have raised growing concerns about potential biases toward specific, non-representative patterns. Many works on unsupervised debiasing leverage the tendency of deep models to learn "easier" samples, for example by clustering the latent space to obtain bias pseudo-labels. However, interpreting such pseudo-labels is not trivial, especially for a non-expert end user, as they do not provide semantic information about the bias features. To address this issue, we introduce "Say My Name" (SaMyNa), the first tool to identify biases within deep models semantically. Unlike existing methods, our approach focuses on biases learned by the model. Our text-based pipeline enhances explainability and supports debiasing efforts:…
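The abstract is cut off before it describes the pipeline, but the reviews below characterise it as extracting keywords from captions and contrasting true positives with false positives. The following is only a toy sketch of that general idea, not the authors' implementation: the function names, the stopword list, the example captions, and the frequency-difference score are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical stopword list, chosen only for this toy example.
STOPWORDS = frozenset({"a", "the", "on", "in", "of", "with"})

def keyword_frequencies(captions, stopwords=STOPWORDS):
    """Return, for each keyword, the fraction of captions that contain it."""
    counts = Counter()
    for caption in captions:
        words = {w.strip(".,").lower() for w in caption.split()}
        counts.update(words - stopwords)  # count each word once per caption
    n = max(len(captions), 1)
    return {word: c / n for word, c in counts.items()}

def contrast_keywords(tp_captions, fp_captions, top_k=3):
    """Rank keywords by how much more often they occur in captions of
    false positives than in captions of true positives (assumed scoring)."""
    freq_tp = keyword_frequencies(tp_captions)
    freq_fp = keyword_frequencies(fp_captions)
    vocab = set(freq_tp) | set(freq_fp)
    scores = {w: freq_fp.get(w, 0.0) - freq_tp.get(w, 0.0) for w in vocab}
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]

# Made-up captions for samples predicted as "waterbird".
true_positives = ["a duck on the lake", "a gull in the sky"]
false_positives = ["a sparrow on the lake", "a crow on the lake", "a finch on the lake"]

candidates = contrast_keywords(true_positives, false_positives)
print(candidates[0][0])  # keyword most associated with the mistakes
```

In this toy data the background word, rather than any bird name, ranks first, which is the kind of human-readable hint a text-based bias report is meant to give. A real system would need a captioning model and a more robust scoring scheme than raw frequency differences.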
Peer Reviews
Decision · Submitted to ICLR 2025
- The paper tackles an important problem in machine learning, namely bias and spurious correlations, and proposes an effective tool to analyse these biases from the endpoint of humans.
- Experimental analysis on bias discovery is lackluster. I think correlation analyses between the proposed method and human annotations are needed.
- The efficacy of the method could depend heavily on the model type and alignment of the MLLM or text encoder. I believe there should be an experimental analysis to show the robustness of the method on this matter.
This paper addresses the critical issue of model bias discovery through an interesting approach that utilizes natural language keyword descriptions.
The proposed method lacks novelty since it shares many components with existing literature. For example, the iteration selection based on misclassification confidence outlined in Section 3.1 is a variant of the approach described by Nahon et al. (2023), while the keyword extraction from natural language captions is similar to that found in Kim et al. (2024). Although these references are cited, the paper does not clearly delineate which aspects are novel, making it challenging to assess its orig
This paper is well-written and addresses the critical research question of identifying unknown dataset bias (spurious correlation) within training. This would essentially enhance the explainability and reliability of models in real-world applications especially for safety critical purposes.
**1. Lack of novelty and effectiveness**: Several key ideas of this paper already exist in a previous paper [1]. These include 1) sampling keywords using a pretrained captioning model, and 2) identifying bias keywords. Despite subtle technical differences, e.g., detecting bias keywords by contrasting true and false positives (this work) versus true positives and false negatives (Kim et al. [1]), overall this paper does not provide any scientific novelty for the same goal beyond the existing pap
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI · Topic Modeling
