Beyond Model Interpretability: Socio-Structural Explanations in Machine Learning
Andrew Smart, Atoosa Kasirzadeh

TL;DR
This paper introduces sociostructural explanations as a new interpretability approach that considers social structures influencing machine learning outputs, demonstrated through a racially biased healthcare algorithm.
Contribution
It proposes sociostructural explanations as a novel interpretability framework that incorporates social context into understanding machine learning outputs.
Findings
Sociostructural explanations reveal how social structures influence model outputs.
The healthcare case study shows that social structures contribute to algorithmic bias.
The paper highlights the need for transparency beyond technical model interpretability.
Abstract
What is it to interpret the outputs of an opaque machine learning model? One approach is to develop interpretable machine learning techniques. These techniques aim to show how machine learning models function by providing either model-centric local or global explanations, which can be based on mechanistic interpretations revealing the inner working mechanisms of models or nonmechanistic approximations showing relationships between input features and output data. In this paper, we draw on social philosophy to argue that interpreting machine learning outputs in certain normatively salient domains could require appealing to a third type of explanation that we call sociostructural explanation. The relevance of this explanation type is motivated by the fact that machine learning models are not isolated entities but are embedded within and shaped by social structures. Sociostructural explanations aim to…
