TL;DR
This paper introduces Hierarchical Network Dissection, an interpretability pipeline for face inference models. It handles spatially overlapping concepts and global concepts that are not tied to a face region, and uses the resulting unit-concept pairings to expose model representations and biases.
Contribution
It extends Network Dissection to handle face-centric models, enabling detailed interpretation of internal representations and bias detection.
Findings
Models trained for different tasks learn distinct internal features.
The method uncovers biases present in training data.
Hierarchical Network Dissection pairs model units with facial concepts, making the internal representations of face models interpretable.
Abstract
This paper presents Hierarchical Network Dissection, a general pipeline to interpret the internal representation of face-centric inference models. Using a probabilistic formulation, our pipeline pairs units of the model with concepts in our "Face Dictionary", a collection of facial concepts with corresponding sample images. Our pipeline is inspired by Network Dissection, a popular interpretability model for object-centric and scene-centric models. However, our formulation allows us to deal with two important challenges of face-centric models that Network Dissection cannot address: (1) spatial overlap of concepts: there are different facial concepts that simultaneously occur in the same region of the image, like "nose" (facial part) and "pointy nose" (facial attribute); and (2) global concepts: there are units with affinity to concepts that do not refer to specific locations of the face…
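To make the spatial-overlap challenge concrete, the following is a minimal sketch of region-based matching in the style of the original Network Dissection, not the probabilistic formulation of this paper. It assumes a unit is scored against a concept by the intersection-over-union (IoU) between its thresholded activation map and a binary concept mask. The function `iou_score`, the toy 8x8 maps, and the per-image quantile threshold are all illustrative assumptions.

```python
import numpy as np

def iou_score(activation, concept_mask, quantile=0.9):
    """IoU between a unit's top-activated region and a binary concept mask.

    Assumption: the threshold is a quantile of this single toy map; a real
    pipeline would estimate it over a whole dataset of activations.
    """
    threshold = np.quantile(activation, quantile)
    unit_mask = activation > threshold
    intersection = np.logical_and(unit_mask, concept_mask).sum()
    union = np.logical_or(unit_mask, concept_mask).sum()
    return float(intersection / union) if union > 0 else 0.0

# Hypothetical unit that fires on a small central region of the face.
activation = np.zeros((8, 8))
activation[3:5, 3:5] = 1.0

# "nose" (facial part) and "pointy nose" (facial attribute) occupy the
# same pixels, so their masks are identical in this toy setup.
nose_mask = np.zeros((8, 8), dtype=bool)
nose_mask[3:5, 3:5] = True
pointy_nose_mask = nose_mask.copy()

# A concept located elsewhere on the face, for contrast.
eye_mask = np.zeros((8, 8), dtype=bool)
eye_mask[1:3, 1:3] = True

scores = {
    "nose": iou_score(activation, nose_mask),
    "pointy nose": iou_score(activation, pointy_nose_mask),
    "eye": iou_score(activation, eye_mask),
}
print(scores)
```

Under these assumptions the region-based score separates "eye" from "nose" but assigns "nose" and "pointy nose" the same value, since both concepts cover the same pixels. A purely spatial criterion therefore cannot tell a facial part from an attribute of that part, which is the first challenge the abstract describes.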
Taxonomy
Methods: Network Dissection · Hierarchical Network Dissection
