Neural Networks Decoded: Targeted and Robust Analysis of Neural Network Decisions via Causal Explanations and Reasoning

Alec F. Diallo; Vaishak Belle; Paul Patras

arXiv:2410.05484·cs.LG·October 10, 2024

TL;DR

TRACER is a causal inference-based method that interprets neural network decisions by intervening on inputs, analyzing internal activations, and generating counterfactuals, improving transparency without affecting model performance.

Contribution

The paper introduces TRACER, a novel causal inference approach that provides structured, interpretable explanations of DNN decisions without altering the model's architecture or reducing its accuracy.

Findings

01. TRACER outperforms existing interpretability methods across diverse datasets.

02. It effectively identifies feature importance and model biases.

03. TRACER enables the creation of compressed yet accurate neural network models.

Abstract

Despite their success and widespread adoption, the opaque nature of deep neural networks (DNNs) continues to hinder trust, especially in critical applications. Current interpretability solutions often yield inconsistent or oversimplified explanations, or require model changes that compromise performance. In this work, we introduce TRACER, a novel method grounded in causal inference theory designed to estimate the causal dynamics underpinning DNN decisions without altering their architecture or compromising their performance. Our approach systematically intervenes on input features to observe how specific changes propagate through the network, affecting internal activations and final outputs. Based on this analysis, we determine the importance of individual features, and construct a high-level causal map by grouping functionally similar layers into cohesive causal nodes, providing a…
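
The intervention idea described in the abstract can be illustrated with a minimal sketch. This is not the TRACER implementation: the toy network, its hand-picked weights, the zero baseline, and the helper names `forward`, `intervene`, and `intervention_effects` are all assumptions made for illustration. The sketch forces one input feature at a time to a fixed value and measures how much the hidden activations and the output move as a result.

```python
import math
import random

# Hypothetical toy network: 3 inputs -> 2 hidden units (tanh) -> 1 output.
# Weights are fixed by hand for illustration; the third input has zero
# weight everywhere, so it cannot influence the output.
W1 = [[1.5, -0.5, 0.0],
      [0.2, 0.1, 0.0]]
B1 = [0.0, 0.1]
W2 = [1.0, -2.0]
B2 = 0.05


def forward(x):
    """Return (hidden activations, output) for one input vector."""
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(W1, B1)]
    output = sum(w * h for w, h in zip(W2, hidden)) + B2
    return hidden, output


def intervene(x, index, value):
    """Return a copy of x with feature `index` forced to `value`."""
    x_new = list(x)
    x_new[index] = value
    return x_new


def intervention_effects(samples, baseline=0.0):
    """Mean absolute shift in hidden activations and output, per feature.

    Assumption: each feature is replaced by a constant baseline value.
    Other intervention schemes are possible.
    """
    n_features = len(samples[0])
    effects = []
    for i in range(n_features):
        hidden_shift = 0.0
        output_shift = 0.0
        for x in samples:
            h_ref, y_ref = forward(x)
            h_int, y_int = forward(intervene(x, i, baseline))
            hidden_shift += sum(abs(a - b) for a, b in zip(h_ref, h_int)) / len(h_ref)
            output_shift += abs(y_ref - y_int)
        effects.append({
            "feature": i,
            "hidden_shift": hidden_shift / len(samples),
            "output_shift": output_shift / len(samples),
        })
    return effects


# Synthetic inputs drawn uniformly from [-1, 1]; seeded for repeatability.
rng = random.Random(0)
samples = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
effects = intervention_effects(samples)
for e in effects:
    print(e)
```

Because the third input has zero weight in this toy network, intervening on it leaves both the hidden activations and the output unchanged, while the other two features produce nonzero shifts. A real analysis would use a trained model and a principled choice of intervention values rather than a constant baseline.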

Taxonomy

Topics: Explainable Artificial Intelligence (XAI)

Methods: Counterfactual Explanations · Causal inference