PLOT: Progressive Localization via Optimal Transport in Neural Causal Abstraction

Jonathn Chang; Arya Datla; Ziv Goldfeld

arXiv:2605.06979 · cs.LG · May 11, 2026

TL;DR

PLOT is a transport-based framework for localizing causal variables in neural networks. It aims to make causal abstraction analysis more efficient and accurate through progressive refinement and optimal transport coupling.

Contribution

The paper proposes PLOT, a method that uses optimal transport to localize causal variables efficiently and to enhance existing causal abstraction techniques such as distributed alignment search (DAS).

Findings

01. PLOT is fast and accurate in localizing causal variables.

02. PLOT-guided DAS achieves DAS-level accuracy with less runtime.

03. Transport-only PLOT performs well across various model complexities.

Abstract

Causal abstraction offers a principled framework for mechanistic interpretability, aligning a high-level causal model with the low-level computation realized by a neural network through counterfactual intervention analysis. Existing methods such as distributed alignment search (DAS) learn expressive subspace interventions, but the relevant neural site is unknown a priori, so finding a handle requires a computationally burdensome search over candidate sites. We introduce PLOT (Progressive Localization via Optimal Transport), a transport-based framework that localizes causal variables from the output effect geometry of abstract and neural interventions. PLOT fits an optimal transport coupling between abstract variables and candidate neural sites, yielding a global soft correspondence that can be calibrated into intervention handles. In simple settings, a single coupling over individual…
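To make the idea of a soft correspondence between abstract variables and candidate neural sites concrete, here is a minimal sketch and not the paper's implementation. It assumes each abstract variable and each candidate site is summarized by a hypothetical "effect signature" vector, uses squared Euclidean distance as the transport cost, uniform marginals, and entropic regularization solved with Sinkhorn iterations. The readout by row-wise argmax is likewise an illustrative choice. None of these details are specified in the abstract.

```python
import numpy as np


def sinkhorn_coupling(cost, a, b, reg=0.1, n_iters=200):
    """Entropic OT via Sinkhorn scaling (an assumed solver, not the paper's)."""
    kernel = np.exp(-cost / reg)      # Gibbs kernel from the cost matrix
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = a / (kernel @ v)          # rescale to match row marginals
        v = b / (kernel.T @ u)        # rescale to match column marginals
    return u[:, None] * kernel * v[None, :]


rng = np.random.default_rng(0)

# Hypothetical effect signatures: one row per abstract variable, describing
# how intervening on that variable changes the output (toy one-hot vectors).
abstract_effects = np.eye(3, 6)

# Hypothetical candidate neural sites. Sites 1, 3, 4 are planted to mimic
# abstract variables 0, 1, 2; sites 0 and 2 are unrelated distractors.
planted_sites = [1, 3, 4]
site_effects = np.zeros((5, 6))
for var, site in enumerate(planted_sites):
    site_effects[site] = abstract_effects[var] + 0.01 * rng.normal(size=6)
site_effects[0, 3] = 1.0
site_effects[2, 4] = 1.0

# Assumed cost: squared Euclidean distance between signatures, scaled to [0, 1].
diff = abstract_effects[:, None, :] - site_effects[None, :, :]
cost = (diff ** 2).sum(axis=-1)
cost = cost / cost.max()

# Uniform marginals over abstract variables and over candidate sites.
a = np.full(3, 1.0 / 3.0)
b = np.full(5, 1.0 / 5.0)

coupling = sinkhorn_coupling(cost, a, b)

# Illustrative readout: the site receiving the most mass from each variable.
best_sites = coupling.argmax(axis=1)
print(np.round(coupling, 3))
print(best_sites)
```

In this toy setup each abstract variable sends most of its mass to its planted site, while the distractor sites are shared roughly evenly. How PLOT calibrates the coupling into intervention handles is not described in the text available here, so the argmax step should be read only as a placeholder.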
