TL;DR
FaceX introduces a novel summary explanation method for face attribute classifiers, enabling comprehensive, region-based interpretability and bias detection across large facial datasets, surpassing traditional pixel attribution approaches.
Contribution
This paper presents FaceX, the first method to provide summary model explanations for face attribute classifiers, improving interpretability and bias detection at a regional level.
Findings
Effective identification of model biases across multiple benchmarks
High interpretability through region-level activation visualization
Robust performance in bias mitigation scenarios
Abstract
EXplainable Artificial Intelligence (XAI) approaches are widely applied for identifying fairness issues in Artificial Intelligence (AI) systems. However, in the context of facial analysis, existing XAI approaches, such as pixel attribution methods, offer explanations only for individual images. This makes it difficult to assess the overall behavior of a model: doing so requires labor-intensive manual inspection of a very large number of instances and leaves to the human the task of forming a general impression of the model's behavior from the individual outputs. Addressing this limitation, we introduce FaceX, the first method that provides a comprehensive understanding of face attribute classifiers through summary model explanations. Specifically, FaceX leverages the presence of distinct regions across all facial images to compute a region-level aggregation of model activations, allowing for…
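The region-level aggregation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how regions are obtained or which activations are aggregated, so the sketch assumes each image already comes with a per-pixel activation (or attribution) map and a per-pixel integer region label, for example from a face-parsing model. The function name `region_activation_summary` and its parameters are hypothetical.

```python
import numpy as np


def region_activation_summary(activations, region_maps, num_regions):
    """Mean activation per facial region, pooled over a whole dataset.

    Hypothetical helper for illustration only.

    activations: array of shape (N, H, W), one activation map per image.
    region_maps: integer array of shape (N, H, W); each pixel holds a
                 region label in [0, num_regions).
    Returns an array of length num_regions; regions that never occur
    in the dataset are reported as NaN.
    """
    activations = np.asarray(activations, dtype=np.float64)
    region_maps = np.asarray(region_maps, dtype=np.intp)
    if activations.shape != region_maps.shape:
        raise ValueError("activations and region_maps must have the same shape")

    labels = region_maps.ravel()
    if labels.size and (labels.min() < 0 or labels.max() >= num_regions):
        raise ValueError("region labels must lie in [0, num_regions)")

    # Sum of activations and pixel count for every region label.
    sums = np.bincount(labels, weights=activations.ravel(), minlength=num_regions)
    counts = np.bincount(labels, minlength=num_regions)

    # Divide only where the region actually appears.
    means = np.full(num_regions, np.nan)
    present = counts > 0
    means[present] = sums[present] / counts[present]
    return means
```

Under these assumptions, one vector of per-region means summarizes the whole dataset, so a reviewer can compare regions directly instead of inspecting heatmaps image by image.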
