Community Detection on Model Explanation Graphs for Explainable AI
Ehsan Moradi

TL;DR
This paper introduces Modules of Influence (MoI), a novel framework that constructs explanation graphs from feature attributions, detects feature modules, and analyzes their roles in model behavior, bias, and redundancy for explainable AI.
Contribution
MoI is a new framework that combines community detection with explanation graphs to uncover feature modules influencing model predictions and biases.
Findings
Uncovers correlated feature groups in synthetic and real datasets.
Enhances model debugging through module-level ablations.
Localizes bias exposure to specific feature modules.
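A module-level ablation can be sketched as follows. This is a minimal illustration, not the paper's reference implementation: it assumes ablation means replacing every feature in a module with its dataset mean, and it scores the module by the mean absolute change in the model output. The names ablate_module, module_effect, and the toy predict function are hypothetical.

```python
def ablate_module(rows, module, baseline):
    """Copy rows, replacing each feature in `module` with its baseline value."""
    return [
        [baseline[j] if j in module else value for j, value in enumerate(row)]
        for row in rows
    ]


def module_effect(predict, rows, module):
    """Mean absolute change in prediction when `module` is ablated.

    Assumption: the baseline is the per-feature mean over `rows`.
    """
    n = len(rows)
    n_features = len(rows[0])
    baseline = [sum(row[j] for row in rows) / n for j in range(n_features)]
    ablated = ablate_module(rows, set(module), baseline)
    return sum(abs(predict(r) - predict(a)) for r, a in zip(rows, ablated)) / n


# Hypothetical toy model: features 0 and 1 dominate, feature 2 barely matters.
def predict(row):
    return 2.0 * row[0] + 2.0 * row[1] + 0.1 * row[2]


rows = [[1, 0, 5], [0, 1, -5], [2, 2, 0], [-1, 1, 3]]
effect_strong = module_effect(predict, rows, [0, 1])
effect_weak = module_effect(predict, rows, [2])
print(effect_strong, effect_weak)
```

In this toy setup, ablating the module {0, 1} moves predictions far more than ablating {2}, which is the kind of signal a module-level debugging pass would look for.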
Abstract
Feature-attribution methods (e.g., SHAP, LIME) explain individual predictions but often miss higher-order structure: sets of features that act in concert. We propose Modules of Influence (MoI), a framework that (i) constructs a model explanation graph from per-instance attributions, (ii) applies community detection to find feature modules that jointly affect predictions, and (iii) quantifies how these modules relate to bias, redundancy, and causality patterns. Across synthetic and real datasets, MoI uncovers correlated feature groups, improves model debugging via module-level ablations, and localizes bias exposure to specific modules. We release stability and synergy metrics, a reference implementation, and evaluation protocols to benchmark module discovery in XAI.
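Steps (i) and (ii) can be illustrated with a small standard-library sketch. It rests on assumptions the abstract does not specify: edge weights are taken to be the absolute Pearson correlation between per-instance attribution columns, edges below a threshold are dropped, and connected components stand in for a full community detection algorithm. The attribution values are synthetic, and the names build_explanation_graph and detect_modules are hypothetical.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def build_explanation_graph(attributions, threshold=0.5):
    """Build a weighted feature graph as {feature: {neighbor: weight}}.

    Assumption: weight = |correlation| of attribution columns, kept if >= threshold.
    """
    n_features = len(attributions[0])
    columns = [[row[j] for row in attributions] for j in range(n_features)]
    graph = {j: {} for j in range(n_features)}
    for i, j in combinations(range(n_features), 2):
        weight = abs(pearson(columns[i], columns[j]))
        if weight >= threshold:
            graph[i][j] = weight
            graph[j][i] = weight
    return graph


def detect_modules(graph):
    """Group features by connected component (a simple stand-in for community detection)."""
    seen, modules = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        modules.append(sorted(component))
    return modules


# Synthetic attributions: features 0-2 share one latent driver, 3-4 share another,
# and feature 5 is independent noise. These are illustrative values, not SHAP output.
random.seed(0)
attributions = []
for _ in range(200):
    a, b = random.gauss(0, 1), random.gauss(0, 1)
    row = [a + 0.1 * random.gauss(0, 1) for _ in range(3)]
    row += [b + 0.1 * random.gauss(0, 1) for _ in range(2)]
    row.append(random.gauss(0, 1))
    attributions.append(row)

graph = build_explanation_graph(attributions, threshold=0.5)
modules = detect_modules(graph)
print(modules)
```

On denser graphs from real attributions, connected components would tend to merge everything into one module, so a modularity-based or label-propagation method would be the more natural choice there; the thresholded version is used here only to keep the sketch dependency-free.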
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Advanced Graph Neural Networks · Multimodal Machine Learning Applications
